The competitiveness of words takes on new meaning when you are responsible for achieving the top spots in search engine results, seeking high click-through rates and looking for new ways to separate from the pack in an effort to create your own identity and brand. Sometimes you simply choose to create a new word that has no previous meaning whatsoever.
To help that word gain traction, it is often tied to a familiar set of words that help define it. Take Google, for example. Before it became a verb, it was a word that needed an assist from terms we understood, like “search” and “engine”. Today, in a Twitter comment, “Google” is understood on its own. No assisting hashtag is required. But if a tweet is about something Google released, never mentions “Google”, and offers the hashtag “#searchengine” as its only clue, there is not enough related subject matter to know which search engine is being discussed.
As a search engine marketer or content writer you learn to figure out the best keywords for marketing strategies, article titles, anchor text, navigation labels and domain names. Keyword strength is measured and analyzed and decisions are made on how, where, and when to use them. That’s where the digging may stop.
It’s not enough. In fact, the study of words is rather ancient, and the ways in which we analyze them today for indexing, categorization and algorithms make up a popular field of study.
As poetic as it sounds, the aboutness of words is said to have been first defined in 1969 as a way to describe a variety of topical relationships, such as the relationship between a document and its subject or between a word and another subject word. Aboutness describes the relationship between a word and the subject areas associated with it. A word with high aboutness has a strong subject association, and such words make good search terms. Words with low aboutness are also known as stop words, like “and”, “the” and “an”.
Since not all words are equally valuable as search terms, we look for those with high aboutness values. Every good content writer knows how to strictly edit out weak words, and this is one of the reasons why. Every word choice should count. An aboutness coefficient can be developed to estimate the strength of the aboutness relationship. This is the part where most of us are challenged, because the relationships aren’t fixed or even stable. We’re targeting people, after all.
Take the word “blind”. By itself the word has different meanings. The strength of the aboutness increases when we add other words to it, such as “blind eye”. Now we know the subject may be a human eye that can no longer see clearly. But it could also be part of the popular phrase “turning a blind eye”, which changes its meaning and injects a possible emotional state. We could use the word “blind” in a phrase like “going in blind”, call a test “double blind”, or search Amazon for a “blind” as a window treatment. Sometimes the word may be perceived as less than desirable in certain contexts, where “sight impaired” may be the better term for the target reader.
The aboutness of a word is fortified not only by the strength of the words it is related to but also by how it is accepted and interpreted by the reader. This ties it to user experience and desirability.
One of the unusual discoveries in my website usability and conversions site audit work is the occasional situation where after reviewing the site I still have no idea what they are selling. In one case, the product was software, but nowhere in the content was the word “software” used. For another audit, product names were used but they didn’t accurately explain by themselves what the product was or what it did. When used as navigation labels, the words had no meaning. Someone may have thought they were searchable brand terms but for a startup or newcomer, they had no understandable meaning.
Other empty words used in navigation are “Products”, “Services” and “About”. I can’t think of anyone who searches for “products” or “services” or “about” in a search engine or finds them to be easy wayfinding clues. The aboutness of those terms is weak.
Sometimes I’m provided with a list of keywords that are weighed and organized for use throughout the site. There are different ways of determining their rank and strength but I often find the words to be empty vessels devoid of a story.
Quantifying the Aboutness of Words
Since many of you are data people, and certainly fascinated with algorithms, you may find this study interesting. In The Aboutness of Words, the authors wanted to identify useful search terms that were not represented in the Library of Congress Subject Headings or the Faceted Application of Subject Terminology (FAST) vocabularies and could not be matched through other mechanisms. One of their goals was to tackle the assumption that words that are randomly distributed across subject areas lack aboutness, whereas words whose usage is concentrated in a few subject areas have aboutness.
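For the data people, here is a minimal sketch of that assumption in Python. It is not the coefficient from the study. It simply scores a word by how concentrated its usage is across subject areas, using one minus the normalized entropy of its distribution. The function name, the subject areas and all of the counts are made up for illustration.

```python
import math

def concentration_score(subject_counts, n_subjects):
    """Score a word from 0 (spread evenly over all subjects) to 1 (one subject only).

    This is 1 minus the normalized Shannon entropy of the word's usage
    across subject areas. It is an illustrative stand-in, not the
    aboutness coefficient defined in the study.
    """
    total = sum(subject_counts.values())
    if total == 0 or n_subjects < 2:
        return 0.0
    entropy = 0.0
    for count in subject_counts.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log(p)
    return 1.0 - entropy / math.log(n_subjects)

# Hypothetical counts of documents per subject area that use each word.
usage = {
    "the": {"medicine": 100, "cooking": 100, "astronomy": 100, "law": 100},
    "telescope": {"medicine": 1, "cooking": 1, "astronomy": 97, "law": 1},
    "blind": {"medicine": 50, "cooking": 0, "astronomy": 10, "law": 40},
}

scores = {word: concentration_score(counts, 4) for word, counts in usage.items()}

# Print the words from most to least concentrated.
for word, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{word}: {score:.2f}")
```

With these invented numbers, “telescope” scores near the top, the stop word “the” scores at zero, and “blind” lands in between, which matches the intuition that it needs companion words to pin down its subject.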
Which leads me to how certain content quickly becomes hot news.
The Nobody Cares Until Everybody Cares Event
I write about accessibility, which, next to usability, is just about the very last piece of website design information companies want to pay for. The recent articles I wrote to help get the word out that the USA is going to vote on reducing the rights of disabled people garnered a few caring shares, but otherwise even the major news outlets ignored the coverage put out by the ACLU and websites that cover ADA topics.
What makes content turn into fireworks? Aside from all the tricks of the trade you’re all thinking of right now, a study on microblogging provided some food for thought.
Microblog analysts did some research with Twitter, where they wanted to find insights and events of significance hidden among the volumes of text. How soon does an event become hot news? How is that determined? How does one wrestle content seen by untold millions of people around the world and determine if it relates to a crisis, breaking news or epidemic situation?
Topic extraction is ongoing, and the studies and applications like Twitter Rank are numerous. To understand the nature of this form of social network, analysts characterize datasets by traits such as volume, diversity of content, brevity, absence of structure, time sensitivity, and author information. Some analysts argue that content length brings better results, whereas others feel authorship is the influencer.
Another application of microblog analytics is capturing real-world events in near real time by adding the location and specific time, but this fails in a global environment, so sometimes location is excluded. Keywords like “earthquake” may be tracked for detection. Retweets and keyword clusters also serve as data sources.
Another proposed technique, topic pathway separation and event detection, was presented in a study called Automatic Event Detection in Microblogs Using Incremental Machine Learning. The authors define it this way: “A topic pathway is a series of microblogs that discuss a common topic.” A “topic segment” is a part of the topic pathway that belongs to a particular batch. They created an algorithm that learned semantically similar cluster chains across batches of microblogs. They also wanted to track sentiment.
One of the findings they demonstrated with their setup was that topic segments provide information about the subtopics that appear across different time periods. For events, they could monitor intensity through three event indicators: volume, positive sentiment and negative sentiment.
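To make keyword tracking concrete, here is a rough Python sketch. It assumes posts have already been grouped into equal time windows, counts the posts in each window that mention a keyword, and flags a window when its count jumps well above the average of the windows before it. The function names, the warmup length and the threshold are my own illustrative choices, not taken from any of the studies mentioned.

```python
from statistics import mean, pstdev

def keyword_counts(batches, keyword):
    """Count the posts in each time window that contain the keyword."""
    keyword = keyword.lower()
    return [sum(keyword in post.lower() for post in batch) for batch in batches]

def burst_windows(counts, warmup=3, threshold=3.0):
    """Return indexes of windows whose count spikes above the earlier windows.

    A window is flagged when its count exceeds the mean of all earlier
    windows by more than `threshold` standard deviations. The first
    `warmup` windows only build the baseline and are never flagged.
    """
    flagged = []
    for i in range(warmup, len(counts)):
        history = counts[:i]
        baseline = mean(history)
        spread = pstdev(history) or 1.0  # avoid a zero spread on flat history
        if counts[i] > baseline + threshold * spread:
            flagged.append(i)
    return flagged

# Hypothetical per-window counts for the keyword "earthquake".
example_counts = [1, 0, 2, 1, 9, 1]
print(burst_windows(example_counts))
```

A simple substring match like this will also catch unrelated uses of a word, which is the aboutness problem all over again, so real systems lean on more context than a single keyword.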
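As a toy illustration of those three indicators, the Python sketch below tallies volume, positive posts and negative posts for one batch of microblogs. It is not the study's method, which learns clusters incrementally. The tiny word lists and the function name here are invented purely to show the shape of the output.

```python
import re

# Hypothetical, deliberately tiny sentiment word lists for illustration only.
POSITIVE = {"safe", "relief", "great", "thanks"}
NEGATIVE = {"damage", "scared", "terrible", "lost"}

def batch_indicators(batch):
    """Return volume plus counts of posts containing positive or negative words."""
    positive = 0
    negative = 0
    for post in batch:
        # Lowercase and pull out word-like tokens so punctuation is ignored.
        words = set(re.findall(r"[a-z']+", post.lower()))
        if words & POSITIVE:
            positive += 1
        if words & NEGATIVE:
            negative += 1
    return {"volume": len(batch), "positive": positive, "negative": negative}

# One hypothetical batch of posts from a single topic segment.
batch = ["Everyone is safe, thanks!", "Terrible damage downtown", "Just lunch"]
print(batch_indicators(batch))
```

Running the same tally on each batch in turn gives three simple time series to watch side by side.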
There was much more to that study that may inspire you to try something different or new to apply to your own practices.
Automatic Event Detection in Microblogs Using Incremental Machine Learning http://onlinelibrary.wiley.com/doi/10.1002/asi.23896/full
The Aboutness of Words