Simple web scraper/blog spider

$30-100 USD

Cancelled
Posted more than 16 years ago

Paid on delivery
Here at the Miskatonic Machine Lab, we are embarking on a large-scale project involving semantic analysis of blog posts. To gather aggregate data, we need to collect a large number of blog posts on various topics, and we require a PHP script to do so. The script will need to:

1. Scrape Google Blog Search ([login to view URL]) results (sorted by date) for a specified keyword and get the URLs of the first 100 results.
2. Visit each search result and extract the post title, date published, and post body. It must successfully scrape many different blog platforms: it can either intelligently try to determine the post title, date, and body, or hardcode scraper settings for the 20-30 most popular blog platforms indexed by Google Blog Search.
3. For each keyword, store the 100 scraped posts in a MySQL database and then combine them into an RSS feed, so that all 100 posts for a particular keyword are in a single RSS feed.

The feed should be accessible via HTTP in the format [login to view URL]. It should support keywords both with and without quotes. For performance, once a feed for a keyword has been generated, it should be cached.

The script should have a very simple admin interface where I can enter a keyword to generate a new feed, view already generated feeds, and delete keywords (and the posts for that keyword) from the database.

Because this script will be collecting large amounts of data, it needs to be as fast and efficient as possible.

You can test your script with the keyword "my cat died". For that keyword, you would be scraping the following search results URL: [login to view URL];ie=UTF-8&scoring=d&q=%22my+cat+died%22&btnG=Search+Blogs

This data gatherer is critical to our research. Satisfactory completion of this project will lead to us assigning you much larger and more involved projects. We have a $300,000 federal grant to spend on this research, and once we find a quality programmer who can quickly complete a small project like this, we can assign more complex (and expensive!) work.

Thanks in advance for your bids,
Prof. John Gainsworth
Machine Lab
Miskatonic University
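
For bidders who want a concrete picture of the quoted-keyword handling and the one-feed-per-keyword requirement, here is a rough sketch. It is written in Python purely for illustration (the posting asks for PHP). The function names, the post dictionary fields, and the `feed_url` argument are all hypothetical. The query parameter names are copied from the example URL in the posting, and the search host is left out because it is not shown there.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_query(keyword, quoted=True):
    """Return the query string for a keyword, with or without phrase quotes."""
    q = f'"{keyword}"' if quoted else keyword
    # Parameter names and values mirror the example URL in the posting.
    return urlencode({"ie": "UTF-8", "scoring": "d", "q": q, "btnG": "Search Blogs"})


def build_feed(keyword, feed_url, posts):
    """Combine scraped posts into a single RSS 2.0 document for one keyword.

    Each post is assumed to be a dict with the keys
    'title', 'link', 'published' (a datetime) and 'body'.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"Blog posts for: {keyword}"
    ET.SubElement(channel, "link").text = feed_url
    ET.SubElement(channel, "description").text = f"Scraped posts matching {keyword}"
    for post in posts:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = post["title"]
        ET.SubElement(item, "link").text = post["link"]
        # RSS 2.0 expects RFC 822 style dates in pubDate.
        ET.SubElement(item, "pubDate").text = format_datetime(post["published"])
        ET.SubElement(item, "description").text = post["body"]
    return ET.tostring(rss, encoding="unicode")
```

In a real implementation the posts would be read back from the MySQL table for that keyword, and the generated XML would be written to a cache so that later requests for the same keyword skip the scraping step.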
Project ID: 191389

About the project

7 proposals
Remote project
Active 16 years ago

7 freelancers are bidding an average of $99 USD for this job

Please check PMB.
$100 USD in 3 days
4.8 (275 reviews)
8.2

Can be done
$100 USD in 3 days
5.0 (50 reviews)
6.9

Hello, Please look at the PMB. Thanks, Sergey
$100 USD in 10 days
5.0 (25 reviews)
6.6

Pls see PMB
$100 USD in 5 days
5.0 (136 reviews)
6.1

I will use a Perl script to complete the task; please let me know if you are interested.
$100 USD in 7 days
5.0 (162 reviews)
6.3

Hello. I have a fast spider in Java and I can customize it for your task. Please read PM.
$100 USD in 3 days
4.6 (4 reviews)
3.5

Pls see PMB
$90 USD in 2 days
0.0 (0 reviews)
0.0

About this client

Chicago, United States
0.0
0
Member since Nov 1, 2007
