Some Spark and Hive queries

₹1500-12500 INR

Closed
Posted over 1 year ago

Paid on delivery
Spark Use Case (Movie Review Analysis)

IMDb is an online database of movie-related information. IMDb users rate movies and provide reviews. Ratings are on a scale of 1 to 5, with 1 being the worst and 5 the best. The dataset also has additional information, such as the release year of each movie. Analyze the collected data and answer the following questions.

You need to find:
1) The total number of movies
2) The maximum rating of movies
3) The number of movies that have the maximum rating
4) The movies with ratings 1 and 2
5) The list of years and the number of movies released each year
6) The number of movies that have a runtime of two hours

Steps to follow:
1. Create a table in an RDBMS (MySQL, MS SQL Server, or Oracle) and load the data into the table (using bulk insert).
2. Ingest the data into an HDFS location using Sqoop.
3. Create a Hive external table.
4. Read the external table using a PySpark session.
5. Perform the Spark POC queries and save the result in Parquet format.
6. After saving the file, create another external table in Hive and load the Parquet data.
7. Optional: create a BI report (Tableau, Power BI, Kibana).

Note: I'm sharing the bulk insert query for your reference (MSSQL):

create table customers (
    Customer_id int,
    Cust_name varchar(100),
    City varchar(20),
    Grade nvarchar(10),
    Salesman_id int
)

BULK INSERT customers
FROM 'C:\Users\Ramkrishna\Desktop\SQL\MYSQL\Qerry\[login to view URL]' --location with filename
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
)
GO

The data file required for the above can be downloaded from Myeclass, in the Project section, named: [login to view URL]
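The six questions in the posting map fairly directly onto SQL aggregates. The following is a minimal sketch of that mapping, not the client's actual solution: it runs against an in-memory SQLite database so that it is self-contained. The table name `movies`, the column names (`title`, `rating`, `release_year`, `runtime_min`), and the sample rows are all assumptions made for illustration, since the posting does not show the real file layout. "Two hours" is taken to mean a runtime of exactly 120 minutes, which is also an assumption.

```python
import sqlite3

# Hypothetical schema: the real data file's columns are not shown in the posting.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movies ("
    "title TEXT, rating INTEGER, release_year INTEGER, runtime_min INTEGER)"
)

# Made-up sample rows, for illustration only.
sample_rows = [
    ("Movie A", 5, 2001, 120),
    ("Movie B", 2, 2001, 95),
    ("Movie C", 1, 2003, 120),
    ("Movie D", 5, 2003, 110),
    ("Movie E", 3, 2004, 130),
]
conn.executemany("INSERT INTO movies VALUES (?, ?, ?, ?)", sample_rows)

# One query per question, in the order the posting lists them.
queries = {
    "total_movies": "SELECT COUNT(*) FROM movies",
    "max_rating": "SELECT MAX(rating) FROM movies",
    "count_at_max_rating": (
        "SELECT COUNT(*) FROM movies "
        "WHERE rating = (SELECT MAX(rating) FROM movies)"
    ),
    "movies_rated_1_or_2": (
        "SELECT title, rating FROM movies "
        "WHERE rating IN (1, 2) ORDER BY title"
    ),
    "movies_per_year": (
        "SELECT release_year, COUNT(*) FROM movies "
        "GROUP BY release_year ORDER BY release_year"
    ),
    "two_hour_movies": "SELECT COUNT(*) FROM movies WHERE runtime_min = 120",
}

# Run every query and collect the rows under the question's name.
results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)
```

In the pipeline the posting describes, the same query strings could probably be passed to `spark.sql(...)` against the Hive external table and the resulting DataFrames written out with `DataFrame.write.parquet(...)`. The scalar subquery in the third query may need adjusting depending on the Hive or Spark version in use.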
Project ID: 35296358

About the project

8 proposals
Remote project
Active 1 year ago

8 freelancers are bidding an average of ₹15,238 INR for this job

Hello... I am interested
₹57,000 INR in 7 days
5.0 (131 reviews)
6.5

Hi, I'm an experienced data scientist with over 7 years of active development experience building machine learning and AI systems using multiple tools and technologies, including R, Python, and PySpark. I hold a Master's degree in Data Science from Trinity College Dublin as well as a Bachelor's degree in Computer Science. I have plenty of experience with big data, Hadoop, and its ecosystem components, especially Sqoop, Flume, Oozie, Hive, and Impala, and I am currently working with all of them, including PySpark. Feel free to check out my profile reviews. Cheers!
₹12,000 INR in 7 days
5.0 (50 reviews)
6.0

Hello, I have read your project description, Spark and Hive queries. I am an expert in database systems and have developed many of them, including but not limited to database systems for a school, a county bursary, a POS system, and a SACCO system. I have used tools such as SQL Developer, SQL Plus, SQLyog, MySQL Workbench, and SSMS. I am confident I can handle your project. Kindly open a discussion with me so that we can talk about the best way to work. If you award me the project, I will work closely with you throughout the whole project life span, communicating continuously at every stage with updates until the project comes to fruition, in perfect condition, and ready for submission. Hope to work with you soon! Thanks, Nyaronyari
₹9,900 INR in 5 days
5.0 (36 reviews)
5.0

PYSPARK EXPERT HERE!!! "Satisfy the client with my ability and passion": this is my slogan here. I hope you will be interested in me. Thanks.
₹10,000 INR in 3 days
5.0 (1 review)
1.8

* Extensive experience working with structured data using HiveQL, join operations, and optimizing Hive queries
* Experience importing and exporting data between HDFS and relational databases using Sqoop
₹4,000 INR in 7 days
0.0 (0 reviews)
0.0

Hi, I can analyze and visualize this data as per your requirements. I will also provide a description of each step to help you understand the project.
₹6,000 INR in 7 days
0.0 (0 reviews)
0.0

I have 4+ years of experience as a Data Engineer. I have hands-on experience with Python, SQL, Hadoop, AWS services, and Power BI as a visualization tool. I have worked with different databases and file formats such as SQL, SAP HANA, Parquet, CSV, Excel, and DynamoDB. I have worked on end-to-end projects.
₹11,000 INR in 7 days
0.0 (0 reviews)
0.0

I can do the work following the steps you mentioned. I can create a Python and Spark script that will work well and that you can run on the data at any time; just change the location of the data file.
₹12,000 INR in 7 days
0.0 (0 reviews)
0.0

About this client

B 5 Block, India
0.0
0
Member since Jan 19, 2020

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)