Swamped by text data and need to pull out the key elements quickly? You’re not alone. In almost any organic strategy project, from content audits, keyword research, and competitor analysis to more user-centric work like analysing feedback or comments for product ideation, entity analysis can save you a lot of time and surface useful insights. With the help of an entity extraction machine learning API, you can automatically identify and extract entities (such as people, places, organizations, and more) from free-form text in seconds.
Today, I’ll show you how to use the entity module of Google’s Natural Language API in Google Sheets on any text data, long-form or short-form, from something as short as a title or meta description to product descriptions or even full articles. The tutorial is suitable for complete beginners, including people with no coding experience or who have not done entity analysis before, as it requires no prior technical skills.
Entity extraction is a core Natural Language Processing (NLP) task that automatically identifies and classifies key elements within text data. These elements, called named entities, fall into predefined categories like people, organizations, locations, and dates.
Entity extraction is typically framed as a supervised machine learning problem, which involves feeding the model labeled examples. For example, to train a model to find names of people in news articles, the training data would include articles where names are already highlighted. By analyzing these examples, the model learns to identify patterns and apply them to find similar entities in new, unseen text. The same approach is then extended to other types of entities and scaled up with more training examples and data.
To build a custom entity extraction model, you can use various approaches, from traditional rule-based methods like regular expressions to advanced deep learning techniques using neural networks. Word embeddings, which capture the meaning of words based on the contexts they appear in, can also be employed to improve the model’s accuracy in pinpointing the correct entities.
Entity extraction models are pivotal for sifting through vast amounts of text data, enabling structured data extraction, enhancing content discoverability, and unlocking valuable insights from text by identifying key entities.
The Natural Language API draws from a vast library of knowledge about language structure, grammar, sentiment, and real-world entities. It’s trained on massive amounts of text data, allowing it to:
Google Cloud’s Natural Language API offers an Entity Analysis module that analyses text and extracts mentioned entities. The analyzeEntities module identifies and classifies named entities within the provided text. These entities can be people, organizations, locations, dates, events, works of art, and more – a comprehensive list of entity types is available in the API documentation.
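To make the request side concrete, here is a minimal sketch of what a call to the analyzeEntities REST method takes: a URL with your API key and a JSON body holding the text. The helper name buildEntityRequest and the sample sentence are made up for illustration, and this is not the code of the template script used later in this tutorial.

```javascript
// Sketch: build the REST request for the analyzeEntities method.
// buildEntityRequest is a hypothetical helper name, not part of any Google library.
const ANALYZE_ENTITIES_ENDPOINT =
  "https://language.googleapis.com/v1/documents:analyzeEntities";

function buildEntityRequest(text, apiKey) {
  return {
    // The API key is passed as a query parameter.
    url: ANALYZE_ENTITIES_ENDPOINT + "?key=" + encodeURIComponent(apiKey),
    body: {
      // PLAIN_TEXT tells the API the content is not HTML.
      document: { type: "PLAIN_TEXT", content: text },
      // encodingType controls how mention offsets are calculated.
      encodingType: "UTF8",
    },
  };
}

// Illustrative usage with a placeholder key.
const entityRequest = buildEntityRequest(
  "Google was founded in California.",
  "YOUR_API_KEY"
);
console.log(entityRequest.url);
console.log(JSON.stringify(entityRequest.body, null, 2));
```

The body is the same whether you analyse a title or a full article; only the content field changes.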
Since the same API can also detect sentiment, it can return two extra values for each entity: a sentiment score, which indicates whether the emotion expressed towards the entity is positive or negative, and a sentiment magnitude, which indicates how strong that emotion is.
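In the v1 REST API, these values come from the analyzeEntitySentiment method, where each entity carries a sentiment object with a score between -1.0 and 1.0 and a non-negative magnitude. The sketch below shows one way you might turn those numbers into a rough label. The helper name labelSentiment, the 0.25 cut-off, and the sample values are all assumptions for illustration, not output from the API or part of the template.

```javascript
// Hypothetical helper: turn a sentiment object into a rough label.
// The 0.25 cut-off is an arbitrary assumption; tune it for your own data.
function labelSentiment(sentiment, threshold = 0.25) {
  if (sentiment.score >= threshold) return "positive";
  if (sentiment.score <= -threshold) return "negative";
  return "neutral or mixed";
}

// Hand-written sample values, not real API output.
const sampleSentimentEntities = [
  { name: "checkout", sentiment: { score: -0.7, magnitude: 1.4 } },
  { name: "delivery", sentiment: { score: 0.6, magnitude: 0.6 } },
  { name: "packaging", sentiment: { score: 0.0, magnitude: 0.1 } },
];

const labelledEntities = sampleSentimentEntities.map((entity) => ({
  name: entity.name,
  label: labelSentiment(entity.sentiment),
  magnitude: entity.sentiment.magnitude,
}));
console.log(labelledEntities);
```

A score near zero can mean either little emotion or mixed emotion, so it is worth reading it together with the magnitude before drawing conclusions.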
The API response of each call also provides details like the entity type, salience, number of mentions, the mentions themselves, and metadata.
This comprehensive entity analysis empowers you to gain a richer understanding of the content and the relationships between entities within your text data.
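As a rough illustration of working with the response, the sketch below ranks entities by salience and keeps the most important ones. The sample object is hand-written in the shape of an analyzeEntities response (it is not real API output), and topEntities is a hypothetical helper name.

```javascript
// Hand-written sample in the shape of an analyzeEntities response (not real API output).
const sampleEntityResponse = {
  entities: [
    {
      name: "California",
      type: "LOCATION",
      salience: 0.21,
      metadata: {},
      mentions: [{ text: { content: "California", beginOffset: 22 }, type: "PROPER" }],
    },
    {
      name: "Google",
      type: "ORGANIZATION",
      salience: 0.79,
      metadata: { wikipedia_url: "https://en.wikipedia.org/wiki/Google" },
      mentions: [{ text: { content: "Google", beginOffset: 0 }, type: "PROPER" }],
    },
  ],
  language: "en",
};

// Hypothetical helper: return the `limit` most salient entities, highest first.
function topEntities(response, limit) {
  const entities = response.entities || [];
  // Copy before sorting so the original response is left untouched.
  return [...entities].sort((a, b) => b.salience - a.salience).slice(0, limit);
}

const entitySummary = topEntities(sampleEntityResponse, 5).map((entity) => ({
  name: entity.name,
  type: entity.type,
  salience: entity.salience,
  mentionCount: entity.mentions.length,
  wikipediaUrl: entity.metadata.wikipedia_url || "",
}));
console.log(entitySummary);
```

Salience is a score between 0 and 1 that reflects how central an entity is to the whole text, which makes it a convenient field to sort on.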
Check out the additional resources by Google Cloud to practice working with this API, and the entity analysis module specifically:
Having selected your Google Cloud project, navigate to the APIs and Services menu > Credentials.
Then, click on the Create Credentials button from the navigation next to the page title, then select API Key from the drop-down menu.
This is the easiest to use, but least secure method of authentication – you might consider alternatives for more complex projects.
Once you click on the Create API key button, there will be a pop-up menu that will indicate that the API key is being created, after which it will appear on the screen for you to copy.
You can always navigate back to this section of your project, and reveal the API key at a later stage, using the Show Key button. If you ever need to edit or delete the API key, you can do so from the drop-down menu.
The next step is to decide on and organise the content you want to extract entities from into Google Sheets.
For a no-code content scraping approach, I recommend using Screaming Frog’s custom extraction function. The approach works in three simple steps:
With this approach, you can quickly get a dataset of scraped content from web pages, or the HTML, depending on the extraction method you select.
You can also scrape content via alternative methods, using Python or third-party tools.
Once you have your content extracted and organised into a spreadsheet-suitable format, you can move on to the next step.
This Google Sheets template, enhanced with an Apps Script, utilizes the Google Cloud Natural Language API for entity analysis. It enables users to identify and categorize entities within text directly in Google Sheets, streamlining data processing and insight gathering without the need for complex coding. Ideal for extracting names, places, brands, and more from large text datasets, this tool simplifies data analysis tasks, making it accessible and efficient for users of all skill levels.
To prepare the data for analysis, we need to do two things – organize the content for analysis, and paste the API key in the script.
In Google Sheets, open the Extensions menu, and click on Apps Script.
Open the attached entityextraction.gs script and select the text that says enterAPIkey. Replace it with the API key from your Google Cloud project. Then click on the disk icon to save, and return to the Google Sheet file.
Paste your content for analysis in the Working Sheet, keeping the URL and content. Keep the header row unchanged in both the Working Sheet and the Entity Sentiment Data sheet (meaning, don’t edit the column names).
To run the analysis, open the Sentiment Tools menu in the top menu bar, then click on Mark Entities and Sentiment. A pop-up notification will appear at the bottom right of the screen, notifying you that the analysis has started. For each entry, a ‘complete’ sign will appear in column C.
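If you are curious what a script like this does with the key, here is a minimal sketch of an Apps Script call to the API. It is not the code of the attached entityextraction.gs script: the function name fetchEntities and the fetchFn parameter are assumptions made for illustration. Inside Apps Script the function falls back to the built-in UrlFetchApp.fetch; the optional fetchFn only exists so the logic can be tried elsewhere.

```javascript
// Placeholder mirroring the enterAPIkey text described above; replace it with your own key.
const API_KEY = "enterAPIkey";

// Sketch only: this is not the code of the attached entityextraction.gs script.
// fetchFn is an optional, hypothetical parameter so the function can be tried
// outside Apps Script; inside Apps Script it defaults to UrlFetchApp.fetch.
function fetchEntities(text, apiKey, fetchFn) {
  const doFetch = fetchFn || ((url, options) => UrlFetchApp.fetch(url, options));
  const url =
    "https://language.googleapis.com/v1/documents:analyzeEntities?key=" +
    encodeURIComponent(apiKey);
  const response = doFetch(url, {
    method: "post",
    contentType: "application/json",
    payload: JSON.stringify({
      document: { type: "PLAIN_TEXT", content: text },
      encodingType: "UTF8",
    }),
    muteHttpExceptions: true, // return error responses instead of throwing
  });
  if (response.getResponseCode() !== 200) {
    throw new Error("Natural Language API error: " + response.getContentText());
  }
  // Fall back to an empty list if the response has no entities field.
  return JSON.parse(response.getContentText()).entities || [];
}
```

In Apps Script you would call fetchEntities(cellText, API_KEY) for each row of content. Because the key sits in the script as plain text, anyone with edit access to the spreadsheet can read it.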
Important: A pop-up screen might appear, asking you to give permissions for the script to run.
You can now switch over to the sheet titled “Entity Sentiment Data” to review which entities the Google Cloud Natural Language API has identified in your content, along with all the associated entity data: the entity type, salience, sentiment score, sentiment magnitude, number of mentions, mentions (examples), and metadata.
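To show how API output could end up as rows like these, here is a minimal sketch that flattens a list of entities into one spreadsheet row per entity. The column names in ENTITY_COLUMNS and the helper entitiesToRows are assumptions for illustration; the template’s actual headers and code may differ.

```javascript
// Hypothetical column layout; the template's real headers may differ.
const ENTITY_COLUMNS = [
  "URL", "Entity", "Type", "Salience", "Sentiment Score",
  "Sentiment Magnitude", "Number of Mentions", "Mentions", "Metadata",
];

// Hypothetical helper: one row per entity, in the column order above.
function entitiesToRows(url, entities) {
  return entities.map((entity) => {
    const mentions = entity.mentions || [];
    const sentiment = entity.sentiment || {};
    return [
      url,
      entity.name,
      entity.type,
      entity.salience,
      // Leave sentiment cells blank when the response has no sentiment data.
      sentiment.score === undefined ? "" : sentiment.score,
      sentiment.magnitude === undefined ? "" : sentiment.magnitude,
      mentions.length,
      mentions.map((mention) => mention.text.content).join(", "),
      JSON.stringify(entity.metadata || {}),
    ];
  });
}

// Hand-written sample entity, not real API output.
const sampleRows = entitiesToRows("https://example.com/page", [
  {
    name: "Google",
    type: "ORGANIZATION",
    salience: 0.79,
    sentiment: { score: 0.3, magnitude: 0.6 },
    mentions: [{ text: { content: "Google" } }, { text: { content: "company" } }],
    metadata: {},
  },
]);
console.log([ENTITY_COLUMNS].concat(sampleRows));
```

In Apps Script, a two-dimensional array like this can be written to a sheet in one call with getRange(row, column, numRows, numColumns).setValues(rows), which is much faster than writing cell by cell.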
Although this step is optional, it is highly recommended that you visualize this data. For this purpose, I’ve created a handy Entity Analysis Looker Studio Dashboard Template, which allows you to:
There are several advantages to using Google’s Natural Language API for entity analysis within Google Sheets:
Getting the data is one thing; learning how to analyse it and use it as part of an organic search strategy is another. Here are just some of the projects where entity analysis can be pivotal to a good organic search strategy:
See the follow-up resources to learn how to harness this data to improve your strategy:
As mentioned at the start, the Natural Language API has several additional capabilities, including text classification, entity sentiment analysis, document sentiment analysis, and syntax analysis. Explore other step-by-step guides on this topic by visiting the resources linked below: