A database of categorized web sites currently includes more than 60 million entries. This is the largest database available for any commercial URL filtering system.
To assign web pages to categories, the system analyzes them using sophisticated classification techniques such as:
Text Classification: Web pages are rated using factors such as the frequency of word occurrences and word combinations.
Optical Character Recognition: Text on images is captured and analyzed.
Visual Object Recognition: Symbols, logos, and trademarks are used to categorize web sites.
Porn Detection: Flesh-tone analysis and face recognition are used to identify pictures with high concentrations of non-facial flesh.
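As a rough illustration of the text classification technique listed above, the following Python sketch scores a page by counting single words and two-word combinations. The category names, term weights, and function names are hypothetical and chosen only for this example; the actual classifier used to build the database is not described here and is likely far more elaborate.

```python
import re
from collections import Counter

# Hypothetical per-category term weights (illustrative values only).
# Keys are single words or two-word combinations.
CATEGORY_TERMS = {
    "sports": {"football": 2.0, "league": 1.0, "world cup": 3.0},
    "finance": {"stock": 2.0, "bank": 1.5, "interest rate": 3.0},
}


def extract_features(text):
    """Count word occurrences and adjacent two-word combinations."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    counts.update(" ".join(pair) for pair in zip(words, words[1:]))
    return counts


def classify(text):
    """Return the best-scoring category, or None if no term matches."""
    counts = extract_features(text)
    scores = {
        category: sum(weight * counts[term] for term, weight in terms.items())
        for category, terms in CATEGORY_TERMS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

In this sketch, a page mentioning "bank", "stock", and "interest rate" would score highest for the hypothetical "finance" category, while a page with no matching terms is left uncategorized.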
The database classifies web pages in 13 languages.
If a user requests a web page that is not included in the database, the URL is sent to the web crawlers and classified within 24 hours.
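The lookup behavior described above can be sketched in Python as follows. This is a minimal in-memory model under stated assumptions: the class name, method names, and URL normalization rule are hypothetical, and the 24-hour turnaround is a property of the crawling service that this sketch does not model.

```python
from collections import deque
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to lowercase host plus path (hypothetical rule)."""
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = (parts.hostname or "").lower()
    return host + parts.path.rstrip("/")


class CategoryDatabase:
    """Toy stand-in for the categorized URL database."""

    def __init__(self):
        self.categories = {}        # normalized URL -> category
        self.crawl_queue = deque()  # unknown URLs awaiting classification

    def lookup(self, url):
        """Return the category, or None after queuing the URL for crawling."""
        key = normalize(url)
        category = self.categories.get(key)
        if category is None and key not in self.crawl_queue:
            self.crawl_queue.append(key)
        return category

    def record(self, url, category):
        """Store a classification result and clear any pending crawl entry."""
        key = normalize(url)
        self.categories[key] = category
        if key in self.crawl_queue:
            self.crawl_queue.remove(key)
```

A request for an unknown URL returns no category and leaves the URL in the queue; once a crawler reports a result through `record`, later lookups return that category.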