The Art of Searching

Altamira TC
33.6K views, 13 years ago
http://www.nearinfinity.com

Tom Neumark presents on searching.

Blur, which is based on Apache Lucene, is a search engine capable of searching billions of records quickly. The data structures and algorithms that make Lucene work are built from simple structures, combined piece by piece to enable more sophisticated functionality such as comparing documents through cosine similarity. In this talk, we'll start with a simple search example, build up to a vector space model, and explain some of the underlying math needed to normalize the weights so that comparisons among documents are fair.
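As a rough illustration of the vector space idea described above, here is a minimal Python sketch. It is not Lucene's or Blur's actual scoring code; the function names, the whitespace tokenizer, and the use of raw term counts as weights are all assumptions made for the example. Each text becomes a vector of term weights, each vector is scaled to unit length, and the dot product of two unit vectors is their cosine similarity.

```python
import math
from collections import Counter


def term_weights(text):
    """Count how often each term occurs (hypothetical weighting: raw term frequency)."""
    return Counter(text.lower().split())


def normalize(weights):
    """Scale a weight vector to unit Euclidean length so document length does not dominate."""
    norm = math.sqrt(sum(w * w for w in weights.values()))
    if norm == 0:
        return {}
    return {term: w / norm for term, w in weights.items()}


def cosine_similarity(text_a, text_b):
    """Dot product of the two normalized term vectors."""
    unit_a = normalize(term_weights(text_a))
    unit_b = normalize(term_weights(text_b))
    return sum(w * unit_b.get(term, 0.0) for term, w in unit_a.items())


if __name__ == "__main__":
    # Made-up sample documents, ranked against a made-up query.
    docs = [
        "lucene builds an inverted index for fast search",
        "cooking pasta requires boiling water",
        "search engines rank documents against a query",
    ]
    query = "lucene search"
    for doc in sorted(docs, key=lambda d: cosine_similarity(query, d), reverse=True):
        print(round(cosine_similarity(query, doc), 3), doc)
```

Without the normalization step, a long document that repeats a term many times would outscore a short, equally relevant one simply because its raw counts are larger; dividing by the vector length removes that advantage.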
Published 13 years ago, on 1390/09/25 (Solar Hijri calendar).
Viewed 33,621 times.