Locoy Spider latest version
The latest version of Locomotive Collector (Locoy Spider) is a data collection tool that combines processing, analysis, and mining functions. Locomotive Collector supports collecting 99% of web pages, at 7 times the speed of ordinary collectors. It also supports remote downloading of image files and collecting information from websites that require login. Huajun Software Park provides a download service for Locomotive Collector (Locoy Spider); to download other versions of the software, please visit Huajun Software Park.
Introduction to Locomotive Collector software
1. Supports all website encodings: Locomotive Collector can collect web pages in any encoding format, and the program can identify a page's encoding automatically.
2. Multiple publishing methods: Locomotive Collector supports the current mainstream and non-mainstream CMS, BBS, and other website programs. The system's publishing module connects the collector to the website program.
3. Fully automatic: The program works unattended. Once configured, it runs according to your settings without manual intervention.
4. Local editing: Collected data can be edited visually on the local machine.
5. Collection and testing: The program lets you view collection results directly and test both collection and publishing, which the vendor describes as unmatched by similar collection software.
6. Convenient management: Locomotive Collector manages collection nodes in a site + task mode. Tasks support batch operations, so management stays simple no matter how much data you have.
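As a rough illustration of what automatic encoding identification (item 1) can involve, here is a minimal Python sketch. It is not Locomotive Collector's actual implementation: the function name `decode_page`, the fallback list, and the 2048-byte scan window are all assumptions made for this example.

```python
import re
from typing import Tuple

# Looks for a charset declaration such as <meta charset="gbk"> near the top of the page.
META_CHARSET = re.compile(rb'<meta[^>]+charset=["\']?([\w-]+)', re.I)

# Assumed fallback order when the page declares nothing (hypothetical choice).
FALLBACKS = ("utf-8", "gb18030")


def decode_page(raw: bytes) -> Tuple[str, str]:
    """Return (decoded_text, encoding_used) for a downloaded page."""
    candidates = []
    match = META_CHARSET.search(raw[:2048])
    if match:
        candidates.append(match.group(1).decode("ascii"))
    candidates.extend(FALLBACKS)
    for name in candidates:
        try:
            return raw.decode(name), name
        except (UnicodeDecodeError, LookupError):
            # Wrong guess or unknown codec name: try the next candidate.
            continue
    # latin-1 maps every byte, so this last resort never raises.
    return raw.decode("latin-1"), "latin-1"


if __name__ == "__main__":
    page = '<meta charset="gbk"><title>采集</title>'.encode("gbk")
    text, encoding = decode_page(page)
    print(encoding, text)
```

The declared charset is tried first, and the fallbacks only apply when the declaration is missing or wrong.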
Locomotive Collector features
Truly universal
Locomotive Collector collects data from any web page or content, and supports multiple extensions that remove operational limits. What you collect and how you collect it is up to you.
Efficient and stable
Locomotive Collector's distributed high-speed collection system lets multiple large servers run stably at the same time, splitting the workload quickly to maximize efficiency.
High cost performance
The product combines high performance with an affordable price. "Saving costs and increasing value for customers" is the service concept behind Locomotive Collector.
Accurate data
Locomotive Collector has a built-in collection monitoring system that reports errors in real time so they can be repaired promptly. It is designed to avoid missing data during collection and publishing, so that users receive accurate data.
Locomotive Collector installation steps
1. Download Locomotive Collector (Locoy Spider) from Huajun Software Park and unzip it to the current folder. Run the Locomotive Collector version 9.21 installer (.exe) to open the license agreement screen, then click Next.
2. On the installation location screen, choose where to install the software. The Huajun editor recommends installing on the D drive. After selecting the location, click Next.
3. When the installation of Locoy Spider is complete, click Finish.
How to use Locomotive Collector
1. In the main interface of the program, click the "New" drop-down arrow and select "Task".
2. In the pop-up window, enter the "Task Name" and click the "Add" button on the right side of the "Start URL" column.
3. The next step is extremely important: divide the website to be collected into sections, analyze the URLs of the articles on the site to find their pattern, and then fill in the form as shown in the figure.
4. Switch to the "Step 2: Collect Content Rules" tab. Here the page content also needs to be divided into sections. Taking "Sogou Browser" as an example, right-click the web page to be analyzed and select "Inspect Element" from the pop-up menu.
5. In the "Development Mode" interface, click the "Select an element on the page to inspect" button, and then click the "Title" content. The tag corresponding to the title is then shown in the "Developer" window. In this example it is "h2".
6. Next, in the "Collection Content Rules" interface, click the "Add" button to add a "Title" item, or double-click the existing "Title" item to modify it. In the pop-up interface, check "Take before and after" and set the front and back to "","".
7. Add the other content collection rules in the same way. Then switch to the "Step 3: Publishing Content Settings" tab, check "Enable Method 2", and make the settings as shown.
8. Finally, check the content to be collected in the task list and click the "Start" button to collect the site's web content according to the rules.
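The "Take before and after" rule in step 6 can be pictured as cutting out the text that sits between a front marker and a back marker. The Python sketch below shows that idea only; it is not Locomotive Collector's code. The helper `take_between`, the sample page, and the `<h2>`/`</h2>` markers (chosen to match the "h2" tag found in step 5) are assumptions for illustration.

```python
from typing import Optional


def take_between(html: str, front: str, back: str) -> Optional[str]:
    """Return the text between the first `front` marker and the next `back` marker."""
    start = html.find(front)
    if start == -1:
        return None  # front marker not found
    start += len(front)
    end = html.find(back, start)
    if end == -1:
        return None  # back marker not found after the front marker
    return html[start:end].strip()


# Hypothetical page fragment used only for this example.
SAMPLE_PAGE = "<div><h2> Sample article title </h2><p>Body text</p></div>"

if __name__ == "__main__":
    print(take_between(SAMPLE_PAGE, "<h2>", "</h2>"))
```

A rule of this kind works only while the markers stay unique and stable on every page of the site, which is why the analysis in steps 3 to 5 matters.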
Frequently Asked Questions about Locomotive Collector
Question: How does Locomotive Collector implement hierarchical content collection?
Answer: This is possible. Add tags to the rules used when fetching the first-level pages, then crawl the second-level pages and define rules for capturing the content on them.
This picture shows the methods and rules for adding tags to first-level pages.
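The two-level flow described in this answer can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's implementation: the in-memory `PAGES` dictionary stands in for a real website, and the link and title patterns are hypothetical rules.

```python
import re
from typing import Dict, List

# Hypothetical in-memory site (URL -> HTML) so the sketch runs without a network.
PAGES: Dict[str, str] = {
    "/list": '<a class="item" href="/post/1">One</a>'
             '<a class="item" href="/post/2">Two</a>',
    "/post/1": "<h2>First post</h2>",
    "/post/2": "<h2>Second post</h2>",
}

# Rule for the first-level page: which links lead to second-level pages.
LINK_RULE = re.compile(r'<a class="item" href="([^"]+)">')
# Rule for the second-level pages: which content to capture.
TITLE_RULE = re.compile(r"<h2>(.*?)</h2>", re.S)


def collect(start_url: str) -> List[dict]:
    """Follow links found on the first-level page and capture each title."""
    results = []
    for url in LINK_RULE.findall(PAGES[start_url]):
        match = TITLE_RULE.search(PAGES[url])
        if match:
            results.append({"url": url, "title": match.group(1).strip()})
    return results


if __name__ == "__main__":
    for row in collect("/list"):
        print(row)
```

Each level has its own rule: the first-level rule only finds URLs, and the second-level rule only extracts content.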
Question: How does Locomotive Collector filter and delete useless information?
Answer: Use the content replacement function.
For more advanced cases, the replace function can filter out and delete spam information, and the asterisk (wildcard) function can perform fuzzy deletion.
For example, suppose we collect a batch of news items through collection rules, and several software download addresses turn out to be mixed into the news titles. The filtering function solves this easily.
Open the editing interface of the title tag, select content filtering, and enter "download" as content that must not be included. Every title containing the word "download" is then filtered out.
After that, select Delete as the filtering action in the detailed settings to remove the collected content we do not want.
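The two techniques in this answer, excluding items that contain a banned word and deleting text with an asterisk wildcard, can be sketched in Python as below. The function names and sample data are hypothetical, and the wildcard semantics (an asterisk matches any run of characters, as short as possible) are an assumption about how such a rule behaves, not documented Locomotive Collector behavior.

```python
import re
from typing import List


def drop_titles_containing(titles: List[str], banned: str) -> List[str]:
    """Keep only titles that do not contain the banned word (case-insensitive)."""
    return [t for t in titles if banned.lower() not in t.lower()]


def wildcard_delete(text: str, pattern: str) -> str:
    """Delete every span matching `pattern`, where * stands for any run of characters."""
    # Escape the literal pieces and join them with a non-greedy "anything" matcher.
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    regex = ".*?".join(literal_parts)
    return re.sub(regex, "", text, flags=re.S)


if __name__ == "__main__":
    titles = [
        "City opens new library",
        "Free download of player v2",
        "Weather update",
    ]
    print(drop_titles_containing(titles, "download"))
    print(wildcard_delete("News[ad: buy now] today", "[ad*]"))
```

The first function mirrors the "must not contain" filter from the example, and the second mirrors fuzzy deletion with an asterisk.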
Question: How does Locomotive Collector collect pictures?
1. Take collecting pictures from a shopping site as an example. First, copy the URL and open the website, then choose a category of pictures to collect. Any category you like can serve as the collection target.
2. Create a new task and edit the rules for collecting URLs.
3. The product listing has 2421 pages. To save time, only the first 5 pages are collected here. Add the first 5 starting page URLs to Locomotive Collector in batch.
4. Open the 5 starting page URLs you just added, right-click, and view the source code. Find where each product link begins and ends in the source code, and use that to define the rules for collecting URLs. As shown below.
5. Save the collection rules, run a test collection, confirm that the collected URLs are correct, and proceed to the next step.
6. Edit the content collection rules. Because only pictures are being collected, only the rules for collecting content need to be edited.
7. The collection content rules are set as follows:
8. Check the download image option, set the image saving path, and save.
9. Once the content settings are saved, the setup is complete and you can start collecting.
10. All collected pictures can be found in the [date] folder of Locomotive Collector.
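The picture collection steps above can be approximated in Python as follows. This is a sketch under assumptions, not the tool's behavior: the URL template, the image pattern, and the ISO date format used for the folder name are all hypothetical, and nothing is downloaded here.

```python
import re
from datetime import date
from pathlib import PurePosixPath
from typing import List

# Hypothetical paginated listing URL; a real site will use its own pattern.
LIST_URL_TEMPLATE = "https://shop.example.com/list?page={page}"

# Hypothetical content rule: capture the src of <img> tags that point at image files.
IMG_RULE = re.compile(r'<img[^>]+src="([^"]+\.(?:jpg|jpeg|png|gif))"', re.I)


def start_urls(first: int, last: int) -> List[str]:
    """Batch-generate the starting page URLs (step 3)."""
    return [LIST_URL_TEMPLATE.format(page=p) for p in range(first, last + 1)]


def image_urls(html: str) -> List[str]:
    """Apply the content rule to one page and return the image URLs (steps 6 and 7)."""
    return IMG_RULE.findall(html)


def save_path(root: str, image_url: str, day: date) -> str:
    """Build a save path inside a per-date folder (steps 8 and 10; folder format assumed)."""
    filename = image_url.rsplit("/", 1)[-1]
    return str(PurePosixPath(root) / day.isoformat() / filename)


if __name__ == "__main__":
    print(start_urls(1, 5))
    sample = '<img class="p" src="https://img.example.com/a/1.jpg"><img src="/icons/x.svg">'
    for url in image_urls(sample):
        print(save_path("images", url, date.today()))
```

Generating the start URLs from a template corresponds to adding the first 5 pages in batch instead of typing each one.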
Comparison of similar software
Octopus data collection system: Built around a fully independently developed distributed cloud computing platform, it can obtain large amounts of standardized data from websites and web pages in a short time. It helps any customer who needs information from web pages to automate data collection, editing, and standardization, ending the dependence on manual searching and collecting, which lowers the cost of obtaining information and improves efficiency.
Easy Map Data Collection Master: Software that specializes in collecting data such as mobile phone numbers, landline numbers, addresses, and coordinates of merchants, companies, and stores from Baidu Maps, 360 Maps, Amap, Sogou Maps, Tencent Maps, Tuba Maps, and Tiantu Maps. Its stated strengths over similar software are the most professional map collection, the fastest collection speed, the most accurate results, and the simplest operation.
Locomotive Collector supports collecting 99% of web pages, at 7 times the speed of ordinary collectors. Locoy Spider also supports remote downloading of image files and collecting information from websites that require login. Download and use it now!