Determining Crawler Authentication
Philip says that it is important to determine who crawls the data and how the crawler authenticates. A secure crawler generally has to run in some sort of superuser
mode, because it needs read access to all of the documents to be indexed.
This can be achieved in one of two ways:
These two methods are essentially very similar: in both, a password proves that the SES instance is authorized. The main difference is that in the first method, the data source does not need to know that it is being crawled by SES.