Red Leaf Article Collector 3.6 Green Edition

  • Software licensing: free software
  • Software size: 5.0MB
  • Software type: Domestic software
  • Update time: 2017-01-20
  • Application platform: WinAll
  • Software language: Simplified Chinese
  • Version: 3.6 Green Edition

Basic introduction
Red Leaf Article Collector (full name Hongye Article Collector, English name Fast_Spider) is a powerful website article collector. It is a spider-style program that collects large numbers of high-quality articles from designated websites. It discards junk page content outright, saves only articles worth reading and browsing, and automatically performs HTM-TXT conversion. This is green software: it can be used right after decompression.

Software features

(1) This software uses the Peking University Tianwang MD5 fingerprint deduplication algorithm, so similar or identical web pages are not saved more than once.

(2) Meaning of the markers in collected data: [[HT]] is the title of the web page, [[HA]] is the title of the article, [[HC]] is the 10 weighted keywords, [[UR]] is an image link in the web page, and [[TXT]] is followed by the main text.

(3) Spider performance: The software runs 300 threads to keep collection efficient. It was stress-tested by collecting 1 million high-quality articles. Using an ordinary home user's Internet-connected computer as the baseline, a single machine can traverse 2 million web pages and collect 200,000 articles per day, so collecting 1 million articles takes only 5 days.

(4) The official version differs from the free version in that it can automatically save the collected article data to an ACCESS database. To purchase the official version, contact QQ 970093569.
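
Features (1) and (2) can be illustrated with a minimal Python sketch. It is not the program's actual code. The record layout (each marker followed by its value, running until the next marker) is an assumption, and the function names `parse_record`, `fingerprint`, and `deduplicate` are hypothetical. The sketch also only detects text that is identical after whitespace is collapsed; how the Tianwang algorithm recognizes merely similar pages is not described here and is not reproduced.

```python
import hashlib
import re

# Assumed layout: "[[HT]]page title[[HA]]article title...[[TXT]]main text".
# The real file layout written by the collector may differ.
MARKER = re.compile(r"\[\[(HT|HA|HC|UR|TXT)\]\]")


def parse_record(record: str) -> dict:
    """Split one collected record into a {marker: value} dict."""
    # re.split with a capture group yields:
    # [text before first marker, name1, value1, name2, value2, ...]
    parts = MARKER.split(record)
    fields = {}
    for name, value in zip(parts[1::2], parts[2::2]):
        # A repeated marker (e.g. several [[UR]]) overwrites the earlier value.
        fields[name] = value.strip()
    return fields


def fingerprint(text: str) -> str:
    """MD5 hex digest of the text with runs of whitespace collapsed."""
    normalized = " ".join(text.split())
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def deduplicate(records):
    """Keep the first record for each distinct main-text fingerprint."""
    seen = set()
    unique = []
    for record in records:
        fields = parse_record(record)
        digest = fingerprint(fields.get("TXT", ""))
        if digest not in seen:
            seen.add(digest)
            unique.append(fields)
    return unique
```

Fingerprinting only the [[TXT]] field means two pages with different titles but the same body count as duplicates, which is one plausible reading of "identical web page information".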

How to operate

(1) Before use, you must ensure that your computer can connect to the network and that the firewall does not block this software.

(2) Run SETUP.EXE and setup2.exe to install the support libraries into the operating system's system32 directory.

(3) Run spider.exe, enter the entry URL, click the "Manual Add" button, and then click the "Start" button to begin collecting.

Things to note

(1) Crawling depth: Enter 0 for unlimited crawling depth; enter 3 to crawl down to the third layer.

(2) The difference between general spider mode and classified spider mode: Assume the entry URL is "http://youxi.baidu.com/". In general spider mode, every web page in "baidu.com" is traversed; in classified spider mode, only the web pages in "youxi.baidu.com" are traversed.

(3) "Import from MDB" button: Imports entry URLs in batches from TASK.MDB.

(4) This software does not cross sites when collecting. For example, if the entry URL is "http://youxi.baidu.com/", it crawls only within the Baidu site.

(5) During collection, one or more error dialog boxes may occasionally pop up. Ignore them and leave them open: closing an error dialog box causes the collector to hang.

(6) Choosing a collection topic: To collect articles about stocks, for example, simply use stock-related sites as the entry URLs.
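
Notes (1), (2), and (4) describe how the depth limit and the two spider modes bound a crawl. The Python sketch below shows one way such rules could work; it is an illustration under stated assumptions, not the program's implementation. The names `site_of`, `in_scope`, and `crawl` are hypothetical, the "site" is naively taken to be the last two labels of the host name, the entry page is counted as layer 1, and an in-memory `links` dict stands in for fetching real pages.

```python
from collections import deque
from urllib.parse import urlparse


def site_of(host: str) -> str:
    """Naive site name: the last two labels of the host (assumption).

    Works for "youxi.baidu.com" -> "baidu.com", but not for suffixes
    such as ".com.cn".
    """
    return ".".join(host.split(".")[-2:])


def in_scope(url: str, entry: str, mode: str) -> bool:
    """mode is 'general' (whole site) or 'classified' (entry host only)."""
    host = urlparse(url).hostname or ""
    entry_host = urlparse(entry).hostname or ""
    if mode == "classified":
        return host == entry_host
    return site_of(host) == site_of(entry_host)


def crawl(entry, links, mode="general", max_depth=0):
    """Breadth-first traversal; max_depth=0 means no depth limit.

    `links` maps a URL to the URLs found on that page.
    Returns the visited URLs in the order they were discovered.
    """
    visited = [entry]
    seen = {entry}
    queue = deque([(entry, 1)])  # the entry page is layer 1 (assumption)
    while queue:
        url, depth = queue.popleft()
        if max_depth and depth >= max_depth:
            continue  # do not follow links past the deepest allowed layer
        for target in links.get(url, []):
            if target not in seen and in_scope(target, entry, mode):
                seen.add(target)
                visited.append(target)
                queue.append((target, depth + 1))
    return visited
```

With the entry "http://youxi.baidu.com/", a link to "http://news.baidu.com/" is followed in general mode but skipped in classified mode, and a link to any other site is skipped in both.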
