DiigIT | IT Community

crawler

By: Admin | 10 Sep 2008 4:29 pm

A crawler extracts all the links on a site until no new links are found, and stores them either in a database or in a file. The search
engine then compares the stored links with the search keyword and prints the links that match.

For example, if a link contains 'ABC' and the search keyword is 'ABC', that link is printed.
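
The crawl-and-match behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the $site array is a made-up in-memory stand-in for pages that a real crawler would fetch over HTTP, and extractLinks(), crawl() and searchLinks() are hypothetical helper names, not part of any existing crawler.

```php
<?php
// Hypothetical in-memory "site": URL => HTML.
// A real crawler would download each URL instead of reading this array.
$site = [
    'index.php' => "<a href='about.php'>About</a> <a href='ABC.php'>ABC</a>",
    'about.php' => "<a href='index.php'>Home</a>",
    'ABC.php'   => "<a href='index.php'>Home</a>",
];

// Collect every href value found in a chunk of HTML.
function extractLinks(string $html): array
{
    preg_match_all('/href\s*=\s*[\'"]([^\'"]+)[\'"]/i', $html, $matches);
    return $matches[1];
}

// Follow links from $start until no new links turn up.
// Returns every URL seen, in the order it was discovered.
function crawl(array $site, string $start): array
{
    $seen  = [$start => true];
    $queue = [$start];
    while ($queue) {
        $url = array_shift($queue);
        foreach (extractLinks($site[$url] ?? '') as $link) {
            if (!isset($seen[$link])) {
                $seen[$link] = true;
                $queue[]     = $link;
            }
        }
    }
    return array_keys($seen);
}

// Return the stored links whose URL contains the keyword (case-insensitive).
function searchLinks(array $links, string $key): array
{
    return array_values(array_filter($links, function ($link) use ($key) {
        return stripos($link, $key) !== false;
    }));
}

$links = crawl($site, 'index.php');
print_r(searchLinks($links, 'ABC'));
```

Note that this kind of search only looks at the link text itself, which is exactly why a keyword that appears only inside a page's content would be missed.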

I have a members page that lists all the members of the site, showing each member's name, age, SPAM, etc. Since there are more than 1000
members, I list 10 members per page, so the link is <a href='members.php?page_id=1'> Next </a>. The page_id then goes
2, 3, ... and so on.
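
As a rough illustration of the pagination arithmetic, the sketch below works out which page a given member falls on and lists the page URLs a crawler would end up visiting. It assumes pages are numbered from 1 and uses exactly 1000 members as a round figure; memberPageUrls() and pageForPosition() are hypothetical names, not code from the actual site.

```php
<?php
// Assumption taken from the post: 10 members are listed per page.
const PER_PAGE = 10;

// Page number (1-based) on which the member at 1-based position $position appears.
function pageForPosition(int $position, int $perPage = PER_PAGE): int
{
    return intdiv($position - 1, $perPage) + 1;
}

// All page URLs needed to list $totalMembers members.
function memberPageUrls(int $totalMembers, int $perPage = PER_PAGE): array
{
    $lastPage = (int) ceil($totalMembers / $perPage);
    $urls     = [];
    for ($page = 1; $page <= $lastPage; $page++) {
        $urls[] = 'members.php?page_id=' . $page;
    }
    return $urls;
}

echo count(memberPageUrls(1000)), "\n"; // number of pages for 1000 members
echo pageForPosition(95), "\n";         // page holding the 95th member
```

Under these assumptions a member shown on page 10 is somewhere between the 91st and the 100th entry in the list.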

Suppose a member named 'ABC' is listed on page 10 of the members page, i.e. <a href='members.php?page_id=10'> leads to the page with that
member's details. If I search for the keyword 'ABC', how will the crawler find 'ABC' when it crawls the members page?

Comments

While the crawler is reading the page looking for any links (href=), it is also looking for your keyword, i.e.
if (($keyat = stripos($line, $key)) !== FALSE && (($grda = strpos($line, ">", $keyat)) === FALSE || strpos(substr($line, $keyat, $grda - $keyat), "<") !== FALSE)) { /* found key */ }
In other words, find the key value on the line and make sure it is not followed by a '>' that closes a tag the key sits inside.
By: Admin | 10 Sep 2008
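
A minimal sketch of the idea in the comment above, applied to the members pages: scan each crawled page line by line and record the page URL when the keyword shows up as visible text rather than inside a tag. The $pages fixture and the names keyInText() and pagesContaining() are assumptions made for illustration, and the "is it inside a tag" test is a simple heuristic, not a full HTML parser.

```php
<?php
// True when $key occurs in $line as visible text rather than inside a tag.
function keyInText(string $line, string $key): bool
{
    $offset = 0;
    while (($at = stripos($line, $key, $offset)) !== false) {
        $gt = strpos($line, '>', $at);
        // No '>' after the key, or a '<' opens before that '>':
        // the key is not sitting inside a tag.
        if ($gt === false || strpos(substr($line, $at, $gt - $at), '<') !== false) {
            return true;
        }
        $offset = $at + 1; // this hit was inside a tag, keep looking
    }
    return false;
}

// URLs of the crawled pages whose text contains the keyword.
function pagesContaining(array $pages, string $key): array
{
    $hits = [];
    foreach ($pages as $url => $html) {
        foreach (explode("\n", $html) as $line) {
            if (keyInText($line, $key)) {
                $hits[] = $url;
                break; // one hit per page is enough
            }
        }
    }
    return $hits;
}

// Hypothetical fixture: two of the paginated member pages, keyed by URL.
$pages = [
    'members.php?page_id=9'  => "<tr><td>XYZ</td><td>30</td></tr>\n<a href='members.php?page_id=10'> Next </a>",
    'members.php?page_id=10' => "<tr><td>ABC</td><td>25</td></tr>\n<a href='members.php?page_id=11'> Next </a>",
];

print_r(pagesContaining($pages, 'ABC'));
```

Because the crawler reaches page 10 by following the Next links, a search for 'ABC' can return that page's URL even though the keyword never appears in the link itself.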
