Enterprise Search: Google vs. Exalead vs. Oracle

by Alistair Miles

Some interesting discussion in the session “Latest Developments in Enterprise Search” at Online Information yesterday (day 2, track 2, session 1)…

Francois Bourdoncle presented the Exalead approach; Roger Ford presented Oracle Secure Enterprise Search (SES); Roberto Solimene presented Google OneBox.

All three speakers emphasised being able to search different types of both structured and unstructured information. Surprisingly, none of the speakers talked about how their products achieve high precision (relevancy). The discussion after the talks was perhaps most interesting, highlighting two major issues in enterprise search: relevancy and privacy … here are my raw notes taken at the time…

Question: What made Google web search good was the fact that it exploited the topology of the hyperlink structure to improve relevancy through ranking. How do your products exploit the topology of the information inside an enterprise to improve relevancy for enterprise search?
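For readers unfamiliar with what "exploiting link topology" means in practice, here is a minimal sketch of a PageRank-style power iteration. It is the textbook formulation, not Google's production ranking, and the toy intranet, the page names and the damping value of 0.85 are my own assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if not targets:
                continue
            # Each page passes an equal share of its rank to every page it links to.
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical intranet: note that "orphan_report" has no inbound links at all,
# which is the typical situation for documents held in databases or file shares.
toy_intranet = {
    "home": ["policies", "wiki"],
    "wiki": ["policies"],
    "policies": ["home"],
    "orphan_report": [],
}
ranks = pagerank(toy_intranet)
for page, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {value:.3f}")
```

The unlinked report ends up with only the baseline score, which illustrates why link analysis alone tells you little about enterprise content that nobody links to.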

Francois – Link topology is only part of the relevancy problem inside an enterprise; there is no linking in e.g. databases. The future lies in the webification of enterprise info – URL addressability of all sources. Link topology does not translate to enterprise information. It is a myth that we will be able to get the right answer straight away; we need new ways to interact with information – search by serendipity. Immediate pleasure … need more than link analysis.

Roger Ford – Relevance is critical, and link analysis can play a part – it is still important in the enterprise. We use metadata to improve relevancy – e.g. are keywords in the title or body, are they close together … ? We have been working on this problem with Oracle Text for many years. Looking into the future, we’re looking at how people interact with the search engine, e.g. if a number of people have clicked on a doc in the hitlist, boost its relevancy according to how people use the search engine. Also integrate with social networking, e.g. if a person in a dept has found something, others in the same dept will have a similar interest. Click analysis is not in the product now but will be in future.
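To make the signals Roger mentions concrete (keywords in title vs. body, keyword proximity, and a future click-based boost), here is a rough sketch of how they could be combined into one score. This is not Oracle's algorithm: the function names, the weights and the log-scaled click boost are all assumptions of mine, chosen only to illustrate the idea.

```python
import math


def min_distance(a_positions, b_positions):
    """Smallest gap, in words, between any occurrence of two terms."""
    return min(abs(a - b) for a in a_positions for b in b_positions)


def score(query_terms, title, body, clicks=0,
          title_weight=2.0, body_weight=1.0,
          proximity_weight=1.0, click_weight=0.5):
    """Toy relevancy score; all weights are illustrative assumptions."""
    # Naive whitespace tokenisation: punctuation is not stripped.
    title_words = title.lower().split()
    body_words = body.lower().split()
    terms = [t.lower() for t in query_terms]

    # Signal 1: a keyword in the title counts for more than one in the body.
    title_hits = sum(1 for t in terms if t in title_words)
    body_hits = sum(1 for t in terms if t in body_words)

    # Signal 2: consecutive query terms that sit close together in the body.
    proximity = 0.0
    for first, second in zip(terms, terms[1:]):
        first_pos = [i for i, w in enumerate(body_words) if w == first]
        second_pos = [i for i, w in enumerate(body_words) if w == second]
        if first_pos and second_pos:
            proximity += 1.0 / (1 + min_distance(first_pos, second_pos))

    # Signal 3: click boost, log-scaled so popular documents do not swamp the rest.
    click_boost = math.log1p(clicks)

    return (title_weight * title_hits + body_weight * body_hits
            + proximity_weight * proximity + click_weight * click_boost)


query = ["travel", "expenses"]
policy = score(query, "Travel expenses policy",
               "claim travel expenses within thirty days")
handbook = score(query, "Staff handbook",
                 "expenses for conference travel are covered elsewhere")
handbook_popular = score(query, "Staff handbook",
                         "expenses for conference travel are covered elsewhere",
                         clicks=10)
print(policy, handbook, handbook_popular)
```

The click term is also where the privacy question discussed later comes in: to compute it, the engine has to record who clicked on what.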

Roberto Solimene – Yes, what made Google is PageRank. We are not using exactly the same algorithm in the enterprise; we have more than 100 algorithms that can be tuned up or down depending on the types of documents and types of information. Our customers are very satisfied with the ranking they have. We invite everyone to test the Google appliance; we are confident in satisfaction on relevancy.

Question: Queries that trigger functions behind the scenes that you may not be aware of?

Roberto – OneBox has a reporting function, and the capability to implement Google Analytics – behaviour of the user on the website. The search box will not get smarter from watching search behaviour; you want good, consistent results.

Roger – acquired a company that monitored which documents you opened on your desktop, but how far to take the monitoring? It’s a discussion we have to have.

Francois – It is essential that people know they are being monitored. Cf. the brakes of your car – they do not learn from the way you brake. People want predictable systems, so we propose a WYSIWYG interface with consistent results to queries: you get the same thing from one day to the next. Anti-phishing tools send the URL in your browser to a central server! The head of Google Spain was in Lisbon recently, talking about personalised Google – Google now knows you. People should be keeping an eye on these things: when you have a company that knows what you are looking for, how you are paying, what you are buying, who you are chatting with, who you are emailing – be really careful! Forget about the promise of delivering the right thing in a couple of keywords. Let people organise knowledge collaboratively.