Basic web scraper for [login to view URL]

Completed · Posted 8 months ago · Paid on delivery

If you have previous experience developing a web scraper script on [login to view URL], I would like to hear from you. Please state in your reply whether you have worked with [login to view URL] before. I will only consider offers from developers who have previous experience developing against the Apify API and platform.

I need a basic web scraper that can scrape all pages on a WordPress website. It needs to find an element with a specific class or id and get the text content from that element. If that element is not found, it should grab the content of the body instead.
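A minimal sketch of the "element or body" fallback described above. It is not tied to any particular parser: `TextLookup` is a hypothetical callback standing in for whatever HTML library or browser API the scraper ends up using, and the rule that a bare name matches either an id or a class is an assumption based on the "id or class" wording.

```typescript
// Hypothetical callback: returns the text of the first element matching a CSS
// selector, or null when nothing matches. A real crawler would wrap its HTML
// parser or browser API behind this signature.
type TextLookup = (selector: string) => string | null;

// Turn the identifier into a CSS selector. Assumption: a value that already
// starts with "#" or "." is used as-is; a bare name matches an id or a class.
function buildSelector(identifier: string): string | null {
  const id = identifier.trim();
  if (id === "") return null;
  if (id.startsWith("#") || id.startsWith(".")) return id;
  return `#${id}, .${id}`;
}

// Return the text of the identified element, falling back to the body text
// when no identifier is given or the element is not found.
function extractContent(identifier: string, lookup: TextLookup): string {
  const selector = buildSelector(identifier);
  const matched = selector === null ? null : lookup(selector);
  const raw = matched ?? lookup("body") ?? "";
  // Collapse runs of whitespace so the output is a single clean string.
  return raw.replace(/\s+/g, " ").trim();
}
```

Because the lookup is injected, the same function can be exercised with a plain map of selectors to text before any crawler is wired in.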

As it will be used as part of an automated process, it must be operated entirely from the CLI.

Additional requirements:

- Scrape the text content of PDF files.

- Report progress to a webhook at intervals

- Post the content in batches to a webhook

- Must be written in JavaScript or TypeScript

Specifications

===========

It takes 5 input parameters:

- starturl [required]

- contentidentifier (default: “”, type:string)

- maxpages (default: 10, zero = all pages, type:string)

- issitemap (default: false, type:string)

- batchsize (default: 25, type:integer)
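The parameter list above could be normalized along these lines. This is a sketch under assumptions: the spec declares `maxpages` and `issitemap` as strings, so both are parsed leniently, and the camel-cased field names and the helper `toNonNegativeInt` are illustrative, not part of the spec.

```typescript
// Raw input as listed in the spec; several values may arrive as strings.
interface RawInput {
  starturl?: string;
  contentidentifier?: string;
  maxpages?: string | number;
  issitemap?: string | boolean;
  batchsize?: string | number;
}

// Normalized form used by the rest of the scraper (names are illustrative).
interface ScraperInput {
  startUrl: string;
  contentIdentifier: string;
  maxPages: number; // 0 means "crawl all pages"
  isSitemap: boolean;
  batchSize: number;
}

// Parse a non-negative integer, returning the fallback for missing or bad values.
function toNonNegativeInt(value: unknown, fallback: number): number {
  if (value === undefined || value === null || value === "") return fallback;
  const n = Number(value);
  return Number.isInteger(n) && n >= 0 ? n : fallback;
}

function normalizeInput(raw: RawInput): ScraperInput {
  if (!raw.starturl) throw new Error("starturl is required");
  const batchSize = toNonNegativeInt(raw.batchsize, 25);
  return {
    // new URL() throws on a malformed URL, which surfaces bad input early.
    startUrl: new URL(raw.starturl).href,
    contentIdentifier: raw.contentidentifier ?? "",
    maxPages: toNonNegativeInt(raw.maxpages, 10),
    isSitemap: String(raw.issitemap ?? "false").toLowerCase() === "true",
    // A batch size of 0 would never flush, so fall back to the default.
    batchSize: batchSize > 0 ? batchSize : 25,
  };
}
```

Validating once at startup keeps the crawl loop free of string parsing and makes the "zero = all pages" rule explicit in one place.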

Crawling behavior

=============

- If the “issitemap” parameter is set to "true", only scrape the links in the sitemap. Otherwise, follow all links that point to the same domain as the starturl.

- Respect [login to view URL]

- If maxpages is set to 0, crawl all pages. Otherwise, scrape only the number of pages set by the maxpages parameter.

- I need the content of the element specified by the “contentidentifier” parameter. For example, if "ContentArea" is specified, get the text inside the div/span that has that id or class.

Suggested value format:

“[login to view URL]” e.g. “[login to view URL]”

“[login to view URL]” e.g. “[login to view URL]”

- Clean up the output and strip all HTML
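The link-following and page-limit rules above can be sketched as two small helpers. The function names are hypothetical, "same domain" is interpreted here as an exact hostname match (an assumption; subdomains are treated as different sites), and the sitemap parsing itself is left out.

```typescript
// True once the page budget is used up; maxPages === 0 means "no limit".
function limitReached(pagesScraped: number, maxPages: number): boolean {
  return maxPages !== 0 && pagesScraped >= maxPages;
}

// Decide which links found on a page should be queued next.
// `seen` holds URLs that were already queued and is updated in place.
function linksToFollow(
  hrefs: string[],
  pageUrl: string,
  startUrl: string,
  isSitemap: boolean,
  seen: Set<string>,
): string[] {
  // In sitemap mode only the sitemap's own entries are scraped, so links
  // discovered on pages are never followed.
  if (isSitemap) return [];
  const startHost = new URL(startUrl).hostname;
  const result: string[] = [];
  for (const href of hrefs) {
    let url: URL;
    try {
      url = new URL(href, pageUrl); // resolves relative links against the page
    } catch {
      continue; // skip malformed hrefs
    }
    if (url.protocol !== "http:" && url.protocol !== "https:") continue;
    if (url.hostname !== startHost) continue; // assumption: exact host match
    url.hash = ""; // fragments point at the same document
    if (seen.has(url.href)) continue;
    seen.add(url.href);
    result.push(url.href);
  }
  return result;
}
```

A crawl loop would call `limitReached` before fetching each page and pass every page's anchors through `linksToFollow`, so the queue only ever grows with new same-host URLs.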

Output format

============

I need the output as JSON:

[{
  "url": "<fully qualified url1 : [login to view URL]>",
  "content": $scrapedContent1
},{
  "url": "<[login to view URL]>",
  "content": $scrapedContent2
}]

Webhook: content

=================

A webhook needs to be called when a page has been scraped. Instead of calling the webhook every time a page has been scraped, the content must be sent in batches. The size of the batch is set by the “batchsize” input parameter.

Development of these webhooks is not part of this task.
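One way to implement the batching described above is a small buffer that hands back a full batch whenever `batchsize` records have accumulated. The class name `BatchBuffer` is hypothetical, and the HTTP call is deliberately left to the caller, since the webhook URL and its expected headers are not specified here.

```typescript
// One scraped page, matching the output format in the spec.
interface PageRecord {
  url: string;
  content: string;
}

// Collects records and releases them in groups of `batchSize`.
class BatchBuffer {
  private items: PageRecord[] = [];
  private readonly batchSize: number;

  constructor(batchSize: number) {
    if (!Number.isInteger(batchSize) || batchSize < 1) {
      throw new Error("batchSize must be a positive integer");
    }
    this.batchSize = batchSize;
  }

  // Add a record; returns a full batch when one is ready, otherwise null.
  add(record: PageRecord): PageRecord[] | null {
    this.items.push(record);
    return this.items.length >= this.batchSize ? this.drain() : null;
  }

  // Return whatever is buffered (possibly nothing) and reset the buffer.
  // Call this once at the end of the crawl to send the final partial batch.
  drain(): PageRecord[] {
    const batch = this.items;
    this.items = [];
    return batch;
  }
}
```

Whenever `add` returns a batch, `JSON.stringify(batch)` yields a body in the output format above, ready to be posted to the content webhook.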

Webhook: Progress report

====================

Progress reports containing statistics about the progress of the crawling process must be sent to a webhook.

This is my first scraper and I am not entirely sure about my options in this regard, but my wish list of information I would like to receive is:

- Number of pages indexed / Total number of pages found.

- Event: CRAWLER_RUN_STARTED

- Event: CRAWLER_RUN_SUCCEEDED

- Event: CRAWLER_RUN_FAILED

- Event: CRAWLER_RUN_TIMED_OUT

- Event: CRAWLER_RUN_ABORTED

- The total cost of the task when the crawl is over.
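The wish list above could map onto a payload shaped roughly like this. The field names, the use of `null` for a periodic update between events, and the rule that the cost is attached only once the run has finished are all assumptions for illustration; how the total cost is obtained is not covered here and is simply passed in by the caller.

```typescript
// Event names taken from the wish list.
type CrawlerEvent =
  | "CRAWLER_RUN_STARTED"
  | "CRAWLER_RUN_SUCCEEDED"
  | "CRAWLER_RUN_FAILED"
  | "CRAWLER_RUN_TIMED_OUT"
  | "CRAWLER_RUN_ABORTED";

// Hypothetical payload shape for the progress webhook.
interface ProgressReport {
  event: CrawlerEvent | null; // null marks a periodic update between events
  pagesIndexed: number;
  pagesFound: number;
  timestamp: string; // ISO 8601
  totalCost?: number; // only present once the run is over
}

// A run is over after any event other than the start event.
function isFinished(event: CrawlerEvent | null): boolean {
  return event !== null && event !== "CRAWLER_RUN_STARTED";
}

function buildProgressReport(
  event: CrawlerEvent | null,
  pagesIndexed: number,
  pagesFound: number,
  totalCost?: number,
  now: Date = new Date(),
): ProgressReport {
  const report: ProgressReport = {
    event,
    pagesIndexed,
    pagesFound,
    timestamp: now.toISOString(),
  };
  // Attach the cost only when the crawl has ended and a value was supplied.
  if (isFinished(event) && totalCost !== undefined) {
    report.totalCost = totalCost;
  }
  return report;
}
```

Passing `now` explicitly keeps the function deterministic, which makes interval reports easy to check without mocking the clock.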

Please note that this is my first experience with running an actor and with Apify, and I will be happy to hear any suggestions you might have.

Best regards

Tony

Web Scraping JavaScript TypeScript

Project ID: #37124002

About the project

66 proposals · Remote project · Active 8 months ago

Awarded to:

gabazza

Hi Tony, I hope this message finds you well. I'm reaching out in response to your project regarding the development of a web scraper using Apify.com. Your project requirements align perfectly with my experience and sk…

$400 USD in 8 days
(6 reviews)
3.1

66 freelancers are bidding an average of $474 for this job

AwaisChaudhry

Hi, good evening. How are you? I just saw your job posting. I see you have been looking for someone experienced with these technologies: TypeScript, Web Scraping and JavaScript. I believe this is something I can help…

$750 USD in 13 days
(78 reviews)
8.3
shayona163

I understand you need a basic web scraper that can scrape all pages on a WordPress website. It needs to find an element with a specific class or id and get the text content from that element. If that element is not fou…

$350 USD in 3 days
(91 reviews)
6.4
AITSoft

Hi, I have read the brief details on the job listing. I am a full stack developer with 6 years of coding experience. I have worked on multiple similar jobs before. I have worked on similar jobs before, especially with…

$750 USD in 13 days
(10 reviews)
6.1
abhisaini0188

Hello there! My name is Abhishek and I am a Full Stack Developer with over 12 years of experience in the tech industry. I specialize in the MEAN/MERN/LAMP (Laravel, Codeigniter, CakePHP) tech stack and have worked on…

$400 USD in 7 days
(8 reviews)
5.5
justmian876

Available!

$250 USD in 7 days
(9 reviews)
5.6
c0909h1179

Hi there, I am the best here! Please check out my profile and see what others have to say about the work I've done related to the skills you're looking for. Hope to work together soon. Thanks!

$500 USD in 7 days
(10 reviews)
5.1
LancerboyAshrak

Hi there! I am Md Ashrak, a highly skilled and experienced data entry, data collection, web scraping, Python scripts, lead generation, human translation, and WordPress specialist with over 10 years of experience. I und…

$300 USD in 7 days
(26 reviews)
5.1
nisthark

Hello, my name is Nisthar and I am a full stack web developer with 10 years of experience in software/web development. I specialize in nodejs, laravel, html/css, javascript, vue, react and more. I understand that you…

$300 USD in 5 days
(18 reviews)
5.6
NadMax

Hi Tony, I'm an experienced web scraping developer with 8+ years of hands-on experience in JavaScript, TypeScript, and Apify. I'm confident that I can assist you with developing the web scraper you need. To proceed f…

$500 USD in 11 days
(8 reviews)
5.0
anayapallavi

Hi tony, I can make web scraper for apify.com. I hope you will give me a chance to work on this project. Please initiate a message for further discussion.

$250 USD in 3 days
(21 reviews)
4.9
chwaqas5434

Hello, I am Waqas, a web developer with 3 years of experience in the field. I understand you are looking for a web scraper that can scrape all pages on a WordPress website and get the text content from that element. If…

$250 USD in 1 day
(25 reviews)
4.9
Dataservicepoint

Hello, my name is Murad, and I am a qualified JavaScript developer with three years of freelance marketplace experience. I understand you need a basic web scraper that can scrape all pages on a WordPress website. It ne…

$250 USD in 5 days
(13 reviews)
4.8
Nehag510

Hello, I am a top-rated plus full stack developer with over 12 years of experience in developing web applications, mobile applications, hybrid applications, web applications and cross-platform applications. I understan…

$500 USD in 7 days
(5 reviews)
3.8
arkcrew1

Hello, my name is Agha Saim and I am part of the Ark Crew team. We understand that you are looking for a web scraper that can scrape all pages on a WordPress website and get the text content from that element. If that…

$500 USD in 7 days
(6 reviews)
3.5
paul612

Hello, I'm Paul and I'm a JavaScript developer with 26 years of experience. I understand you need a web scraper that can scrape all pages on a WordPress website and get the text content from that element. If that eleme…

$500 USD in 7 days
(4 reviews)
3.3
manpreetkaur991

Hello there, my name is Manpreet and I'm a web developer with extensive experience in the field of software development. I noticed you are looking for a basic web scraper that can scrape all pages on a WordPress websit…

$500 USD in 7 days
(4 reviews)
4.1
vitaliipopkov

★★★ Hi Tony F.★★★ Going through your description, it seems like you might be looking for a senior web developer for your project - Basic web scraper for apify.com. As I have worked on similar projects previously, I am…

$500 USD in 5 days
(6 reviews)
2.9
kolisnichenkoiho

"Ihor K was very cooperative, listened to my feedback & successfully finished the task I gave him. I will definitely hire him again for any new projects I will have." Dear Tony F. I'm thrilled to submit my applicati…

$500 USD in 7 days
(1 review)
2.6
REPLATechnology

Dear Client, I am a full-stack developer with 6+ years of experience in developing web applications. I have a strong understanding of Apify and have developed several web scrapers using the platform. I am confident tha…

$650 USD in 18 days
(5 reviews)
1.6
AmrinNahar09

Hello there! My name is Mst Amrin Nahar and I am a JavaScript expert, freelancer with over 10 years of experience. I understand that you are looking for a basic web scraper that can scrape all pages on a WordPress webs…

$250 USD in 1 day
(3 reviews)
1.5