Ontology plays an important role in locating Domain-Specific Deep Web
content. This paper therefore presents WFF, a novel framework for
efficiently locating Domain-Specific Deep Web databases based on focused
crawling and ontology. WFF constructs a Web Page Classifier (WPC), a Form
Structure Classifier (FSC), and a Form Content Classifier (FCC) in a hierarchical
fashion. First, the WPC discovers potentially interesting pages using an
ontology-assisted focused crawler. Then, the FSC analyzes these pages
and determines, based on structural characteristics, whether they contain
searchable forms. Lastly, the FCC identifies searchable forms that
belong to a given domain at the semantic level and stores the URLs of
these Domain-Specific searchable forms in a database. A detailed
experimental evaluation shows that the WFF framework not only simplifies
the discovery process but also effectively identifies Domain-Specific databases.
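
The hierarchical arrangement of the three classifiers can be illustrated with a minimal Python sketch. It is not the WFF implementation: the flat term set standing in for the ontology, the keyword-overlap test used for the WPC, the field-type rules used for the FSC, the field-name matching used for the FCC, and all function and class names are assumptions made only for illustration. What the sketch does show is the cascade itself, in which each stage runs only on what the previous stage accepted.

```python
import re
from dataclasses import dataclass, field
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical stand-in for a domain ontology: a flat set of concept terms.
BOOK_TERMS = {"book", "author", "title", "isbn", "publisher"}


@dataclass
class Form:
    action: str
    fields: list = field(default_factory=list)  # (kind, name) pairs


class FormExtractor(HTMLParser):
    """Collects each <form> together with its input/select/textarea fields."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form":
            self._current = Form(action=a.get("action") or "")
            self.forms.append(self._current)
        elif tag in ("input", "select", "textarea") and self._current is not None:
            kind = (a.get("type") or "text") if tag == "input" else tag
            name = a.get("name") or ""
            self._current.fields.append((kind.lower(), name.lower()))

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def wpc(page_html, terms, threshold=2):
    """Stage 1 (assumed rule): the page mentions enough ontology terms.

    Crude on purpose: tokens are taken from the raw HTML, markup included.
    """
    words = set(re.findall(r"[a-z]+", page_html.lower()))
    return len(words & terms) >= threshold


def fsc(form):
    """Stage 2 (assumed rule): a query field is present and no password field."""
    kinds = [kind for kind, _ in form.fields]
    has_query_field = any(k in ("text", "search", "select") for k in kinds)
    return has_query_field and "password" not in kinds


def fcc(form, terms):
    """Stage 3 (assumed rule): at least one field name is an ontology term."""
    return any(name in terms for _, name in form.fields)


def locate(url, page_html, terms):
    """Run the WPC -> FSC -> FCC cascade; return URLs of accepted forms."""
    if not wpc(page_html, terms):
        return []  # page rejected, so its forms are never inspected
    parser = FormExtractor()
    parser.feed(page_html)
    return [urljoin(url, f.action) for f in parser.forms
            if fsc(f) and fcc(f, terms)]
```

Calling `locate(page_url, page_html, BOOK_TERMS)` on a fetched page returns the resolved action URLs of the forms that pass all three stages; in a full system these would be written to the database of Domain-Specific searchable forms rather than returned as a list.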