Hello and welcome to part II in the annual search patent round-up here on SNC. While some might discount what goes on over at MS/Bing, that’s not the point of the exercise. First off, Bing is still a viable engine with a large user base. Second, some markets actually get better conversions from Bing than from Google. And lastly, a search engineer is a search engineer. We’re looking at how challenges in search are handled by IR geeks… no reverse engineering algorithms… m’kay?
The Year in Search Patents from Microsoft
Over at Microsoft this year we did see some similarities between its awards and Google’s (see yesterday’s post on Google Patents 2011).
While its haul was not as robust as Google’s, Microsoft certainly had a good number of geo-local patent awards and a large number of social ones as well, as did Google. This makes sense, given that both have been a huge focus over the last few years. Behavioural was also strong, which is not surprising given the recent leaps in understanding user behaviour (and MS’s interest in personalization).
Interestingly, Microsoft had a bit of a lead in the land of named entities. This is one area I have been personally watching over the last 18 months, and it seems likely to be one of the more important areas moving forward.
Anyway, here’s the list of goodies from MS this year:
Geo-local
- Search interface for mobile devices
- Inferring user-specific location semantics from user data
- Custom Local Search
- Disambiguating residential listing search results
- Identifying location names within document text
- Location context mining
- Location-aware query based event retrieving and alerting
Semantic and NLP
- Context-based document search
- Iterators for applying term occurrence-level constraints in natural language searching
- Contextual queries
- Semantic object characterization and search
- Semi-automatic example-based induction of semantic translation rules to support natural language search
Social
- System and method for high-density interactive voting using a computer network
- Social distance based search result order adjustment
- Generating activities based on social data
- Segmentation and profiling of users
- Social network notifications for external updates
- Microblog search interface
- Social Network Search
- Leveraging communications to identify social network friends
- Social Network System with Recommendations
- Event Matching in Social Networks
- Social Media playback
- Adaptable relevance techniques for social activity streams
- Integrating a Search Service with a Social Network Resource
- Social home page
- Influence assessment in social networks
Named Entities
- Webpage entity extraction through joint understanding of page structures and sentences
- Providing comparison experiences in response to search queries
- Named Entity Recognition in Query
- Applying model of a persona to search results
- Web-scale entity relationship extraction
- Scalable Incremental Semantic Entity and Relatedness Extraction from Unstructured Text
- Identifying location names within document text
- Comparisons of entities of a particular type
Query Analysis
- Incremental query refinement
- Query expansion through searching content identifiers
- Long query retrieval
- Detecting zero-result search queries
- Determining preferences from user queries
- Detecting Spiking Queries
- Context-aware query classification
- Query correction probability based on query-correction pairs
Behavioural
- Constructing web query hierarchies from click-through data
- Query classification based on query click logs
- Inferring user-specific location semantics from user data
- Leveraging global reputation to increase personalization
- Personal data mining
- Personalizing a search results page based on search history
- Predicting and using search engine switching behavior
- User role based customizable semantic search
- Using behavior data to quickly improve search ranking
- Ranking search results using click-based data
- Active prediction of diverse search intent behavior
- User modification of a model applied to search results
- Context-aware query classification
- Learning user intent from rule-based training data
- Federated implicit search
- Defining user intent
Duplicate Content
Verticals/Universal; Video, News, Images, Blogs, Forums, Ecommerce
- Scoring relevance of a document based on image text
- Multimedia search engine
- Contextual Image Search
- Visual Search Reranking
- Intelligent Image Search Results Summarization and Browsing
- Shopping search engines
Recommendation engine
- Using link structure for suggesting related queries
- Recommending queries when searching against keywords
- Query suggestion generation
- Providing query suggestions
- Recommendation ranking system with distrust
- Automatic query suggestion using sub-queries
Temporal
Categorization
Spam
- Cloaking detection utilizing popularity and market value
- Link spam detection using smooth classification function
- Using content analysis to detect spam web pages
- Locally computable spam detection features and robust pagerank
- Identifying malicious queries
Question / Answer
- Clustering question search results based on topic and focus
- Searching questions based on topic and focus
Semantic markup
- Searching with metadata comprising degree of separation, chat room participation, and geography
- Extracting structured data from web queries
Links
- Using link structure for suggesting related queries
- Explicit and non-explicit links in a document
- Using Anchor Text With Hyperlink Structures for Web Searches
- Manipulation and management of links and nodes in large graphs
Page segmentation
- Adaptive page layout utilizing block-level elements
- Document page segmentation in optical character recognition
Authority
This and That…
These are the somewhat less specific areas which are the nuts and bolts of a search system. While not as targeted as the above categories, they do make some good reading.
Ranking methods
- Combining and re-ranking search results from multiple sources
- Scoring relevance of a document based on image text
- Using categorical metadata to rank search results
- Flexible indexing and ranking for search
- Ranking oriented query clustering and applications
- Custom ranking model schema
- Optimization of discontinuous rank metrics
- Techniques to perform relative rankings for search results
- Training a ranking function using propagated document relevance
- Learning diverse rankings over document collections
- Learning a ranker to rank entities with automatically derived domain-specific preferences
- Topics in relevance ranking model for web search
- Detection of junk in search results ranking
- Semi-Supervised Page Importance Ranking
Systemic
- Method for administrating data storage in an information search and retrieval system
- Web Searching
- Interleaving search results
- Web content mining of pair-based data
- Data-Centric Search Engine Architecture
- Predicting future queries from log data
- Generating search result summaries
- Augmented query search
- Automatic diagnosis of search relevance failures
- Information retrieval system with customization
- Providing search results in response to a search query
Other
- Experimental web search system
- Learning Term Weights from the Query Click Field for Web Search
- Context aware searching
- Web content mining of pair based data
- Bootstrap and adapt a document search engine
- Domain collapsing of search results
- Retrieval of structured documents
- Calculating web page importance
- Augmenting television media
More goodies for geeks
Here are the last two years’ worth of round-ups:
Also, be sure to see the rest of the series including Google Patents 2011 and Yahoo Patents 2011 (coming out tomorrow).
Happy Holidays… and have a great 2012!!



