I am currently scraping a site for job advert details, with the administrator's
permission. I would like to run a process that can follow hrefs on the site to
open other pages until I get to the required location. I know that I can
retrieve a page through CFHTTP, but I would also like to be able to access the
different links in the page that I have retrieved. Is there a way of doing
this? I am really scratching my head at the moment. Any help greatly appreciated.
cf_code_warrior - 29 Oct 2004 21:23 GMT
Not sure I follow... grab the page from the primary source, parse the grabbed
string for further http references, and continue grabbing further pages in a
loop? Am I missing something in your question? Robert
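The "parse for further references" step might look roughly like the sketch below. It is written in Python rather than ColdFusion purely for illustration; `LinkCollector`, `extract_links` and the example.com URL are made-up names, not anything from the site in question. In ColdFusion the input string would be whatever comes back in cfhttp.FileContent.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links such as "/jobs/index.asp" into full URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all link targets found in html as absolute URLs, in page order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links
```

Each URL in the returned list can then be fetched in turn, which is the loop described above.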
samb1 - 30 Oct 2004 13:29 GMT
I need to access the links on the page that I retrieve, so that I don't have to
enter the exact URL in my code. Instead, I can retrieve the home page and then
follow specific links on the page until I get to the location where I will
be doing my scraping. For example, if I want to get to
www.rspb.co.uk/vancacies/index.asp but I only retrieve www.rspb.co.uk, how do I
access the links on www.rspb.co.uk to get to www.rspb.co.uk/vancacies/index.asp?
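One possible way to do that, sketched in Python rather than ColdFusion: start from the home page, pick the link whose href contains a known keyword, fetch it, and repeat once per keyword. Everything named here (`find_link`, `navigate`, `fetch_page`, the "jobs" keyword and the example.com address) is hypothetical, and the regular expression is a simplification that only handles quoted href values.

```python
import re
import urllib.request
from urllib.parse import urljoin

# Simplified pattern: matches <a ... href="..."> with a quoted value only.
HREF_RE = re.compile(r'<a\s[^>]*?href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def fetch_page(url):
    """Download a page and return its body as text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def find_link(html, base_url, keyword):
    """Return the first link whose href contains keyword, as an absolute URL."""
    for href in HREF_RE.findall(html):
        if keyword.lower() in href.lower():
            return urljoin(base_url, href)
    return None


def navigate(start_url, keywords, fetch=fetch_page):
    """Follow one link per keyword from start_url and return the final URL."""
    url = start_url
    for keyword in keywords:
        html = fetch(url)
        next_url = find_link(html, url, keyword)
        if next_url is None:
            raise LookupError(f"no link containing {keyword!r} on {url}")
        url = next_url
    return url
```

A call such as `navigate("http://www.example.com/", ["jobs"])` would then return the address of the jobs page without that address being hard-coded, as long as the home page keeps a link containing that keyword.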