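web n. 1. a spider's web; something weblike, a network or mesh; (textiles, dyeing) a cotton [wool] web; (the W-) (computing) the World Wide Web. 2. a (bolt of) cloth, woven fabric. 3. something woven; a fabrication; a contrived trap. 4. the web (of a waterfowl's foot), webbing; the vane (of a feather). 5. a large roll [reel] of printing paper. 6. (mechanical engineering) a connecting plate; a thin sheet [strip] of metal; (architecture) the web of an I-beam; a vault; (anatomy) a membrane. a web of railroads: a railway network. a web of lies: a tissue of fabricated lies. a web of life [destiny]: fate. the web of a saw: the blade of a saw. the web of a sword: the blade of a sword. a twist web: (mechanical engineering) the core of a twist drill. vt. (-bb-) 1. to weave a web over; to cover as if with a web. 2. to catch in a web; to ensnare. vi. 1. to form a web. 2. to spin a web.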
Search engine technology meets the need for finding information on the web in two respects: web content mining and web structure mining. A traditional search engine consists chiefly of three parts: web crawler -> indexing engine -> search ranking. The paper first analyzes and studies the general techniques traditionally used in each of these three parts, then adapts and extends them to the characteristics of web news in order to build a dedicated Chinese news search engine. The main work of the paper covers five aspects. First, the web crawler: its task is to roam the Internet automatically, follow the hyperlinks in HTML documents to fetch web pages, and analyze and extract from the downloaded pages to obtain their information.
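The crawler behavior described in this example sentence (follow the hyperlinks in HTML documents, fetch the pages, extract from what was downloaded) can be illustrated with a minimal sketch. It is not the method of the paper being quoted; it assumes a breadth-first traversal, and the names `LinkExtractor`, `crawl`, `SITE`, and the `example.test` URLs are hypothetical. To keep it runnable without network access, pages are served from an in-memory dictionary instead of real HTTP requests.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collects the absolute targets of <a href="..."> tags in one document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop any #fragment part.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.append(absolute)


def crawl(seed, fetch, max_pages=100):
    """Breadth-first crawl from seed; fetch(url) returns HTML text or None."""
    seen = {seed}
    queue = deque([seed])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page could not be retrieved; skip it
            continue
        pages[url] = html
        extractor = LinkExtractor(url)
        extractor.feed(html)
        for link in extractor.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


# Hypothetical in-memory "site" standing in for real HTTP fetches.
SITE = {
    "http://example.test/": '<a href="/a.html">A</a> <a href="b.html#top">B</a>',
    "http://example.test/a.html": '<a href="/">home</a>',
    "http://example.test/b.html": '<a href="/missing.html">gone</a>',
}

pages = crawl("http://example.test/", SITE.get)
```

Passing `fetch` as a parameter keeps the traversal logic separate from the download step, so a real HTTP client could be substituted for `SITE.get`. A practical crawler would also need politeness delays, robots.txt handling, and error handling, none of which are shown here.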
As information on the web expands rapidly, web-related services have multiplied accordingly. Web information is applied in many fields, and users' requirements are becoming ever more demanding, so the web crawler, which is responsible for gathering web information, faces a real challenge. A number of large companies at home and abroad already have mature solutions to this problem in practical use, but these large-scale search engines can only offer general users a common search service that cannot be customized and cannot take every user's varied needs into account. Single-machine web crawlers, for their part, are often not up to the task. Medium-scale web crawlers, with their flexible customizability and a collection speed and scale that single-machine crawlers cannot match, meet the growing demand for user-oriented web information. In view of this, the paper studies web crawler technology at home and abroad.