Storage: Raw Storage
**If you haven't grabbed a local copy of our Examples yet, click here to learn how.
The following is based on ai.preferred.crawler.stackoverflow.master in our Examples package.
To get started with FileManager, we need to do a few basic steps:
- Create a Fetcher with FileManager (ListingCrawler.java)
- Initialise MysqlFileManager (ListingCrawler.java)
Instead of using an empty builder, we have to set the FileManager we would like to use when building the Fetcher. Let's change createFetcher() a little so that it takes an additional parameter called fileManager, like this:
private static Fetcher createFetcher(FileManager fileManager) {
    // Look through the builder to see the other options you can set
    return AsyncFetcher.builder()
        .setFileManager(fileManager)
        .build();
}
Define all the configuration for MySQL and the storage directory.
// Define config for MysqlFileManager
final String host = "localhost";
final int port = 3306;
final String database = "filemanager";
final String table = "responses";
final String user = "root";
final String password = "";
final String dir = "C:\\data";
final String mysqlLocation = "jdbc:mysql://" + host + ":" + port + "/" + database;
**It is important that both the database and the storage directory already exist! The MySQL table will be created automatically by the FileManager.
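Since the storage directory has to exist before the crawler starts, it can help to check for it up front. The following is a minimal sketch using only the Java standard library; StorageDirCheck and ensureStorageDir are hypothetical names for illustration and are not part of Venom. It creates the directory if it is missing and fails early if the path cannot be used. The database itself still has to be created separately in MySQL.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class StorageDirCheck {

    // Hypothetical helper (not part of Venom): creates the directory,
    // including missing parents, and fails fast if it is not writable.
    static Path ensureStorageDir(String dir) throws IOException {
        final Path path = Paths.get(dir);
        // No-op if the directory already exists; throws if the path is a regular file
        Files.createDirectories(path);
        if (!Files.isDirectory(path) || !Files.isWritable(path)) {
            throw new IOException("Storage directory is not usable: " + path);
        }
        return path;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the directory used in this tutorial
        final String dir = args.length > 0 ? args[0] : "C:\\data";
        final Path ready = ensureStorageDir(dir);
        System.out.println("Storage directory ready: " + ready.toAbsolutePath());
    }
}
```

A call such as ensureStorageDir(dir) could be placed in the main method before MysqlFileManager is constructed, so that a missing directory is reported before any page is fetched.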
Now we need to initialise MysqlFileManager in our main method, right before the Crawler, then pass fileManager as a parameter to createFetcher().
try (final FileManager fileManager = new MysqlFileManager(mysqlLocation, table, user, password, dir);
     final Crawler crawler = createCrawler(createFetcher(fileManager), session).start()) {
    ...
}
That's all we need to do to get MysqlFileManager working.
Now run the crawler and watch C:\data fill up with HTML files and the MySQL table fill up with records!
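To confirm that files are actually arriving in the storage directory, a quick count can be taken after (or during) a crawl. This is a minimal sketch using only the Java standard library; StoredFileCount and countFiles are hypothetical names and are not part of Venom. It walks the directory recursively because this page does not say whether the FileManager writes into subdirectories.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

class StoredFileCount {

    // Hypothetical helper (not part of Venom): counts regular files
    // under dir, including files in any subdirectories.
    static long countFiles(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.filter(Files::isRegularFile).count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the directory used in this tutorial
        final Path dir = Paths.get(args.length > 0 ? args[0] : "C:\\data");
        System.out.println(countFiles(dir) + " file(s) stored under " + dir);
    }
}
```

On the database side, a row count on the responses table (for example SELECT COUNT(*) FROM responses;) gives the matching figure for the records.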
Venom (c) Your preferred open source focused crawler for the deep web
Blazing fast | Customizable | Robust | Simple and Handy