

.. image:: img/banner/logo_3.png
    :class: dark-light

Welcome to Spatial Data Mining Lab 2025!
========================================


Welcome to the **Spatial Data Mining (ENGO 645/537)** lab! Whether you're new to spatial data or looking to sharpen your skills, this lab is designed to help you explore spatial data and apply it to real-world problems. 
Our focus is on making these concepts practical and approachable, with Python as our main tool.


What You’ll Learn
-----------------


The ENGO 645/537 lab series aims to equip you with the skills and knowledge you need for **spatial data mining** in Python. Here’s what you can expect to achieve:

1. **Learn Python for Data Science and Spatial Applications**  
   From setting up your Python environment to understanding key libraries like ``pandas``, ``matplotlib``, and ``GeoPandas``, you’ll develop essential programming skills. We’ll guide you step by step, starting with installing Python and getting your hands dirty with real data, such as community crime statistics maps. These tools will help you process, clean, and analyze datasets with ease.

2. **Learn the Art of Version Control with Git and GitHub**  
   Ever accidentally lose your project files or struggle with managing versions of your code? Version control will save you! You’ll learn how to manage your projects using Git and GitHub, create repositories, and collaborate efficiently. We’ll also introduce JupyterLab’s Git plugin and show you how to manage projects directly from your command line, so you’ll be version control-savvy by the end of the lab.

3. **Explore Geospatial Data Using Python**  
   Spatial data isn't just numbers: it's geographic! In this lab, you’ll learn to handle geospatial data using Python. You’ll work with spatial joins, spatial relationships, and visualizations that make geographic datasets easier to display and analyze. Think maps, regions, and data combined in meaningful ways!

4. **Build and Evaluate Machine Learning Classification and Clustering Models**  
   You’ll be introduced to **classification in machine learning**, from simple concepts to building a working **Decision Tree Classifier** in Python. You will also explore **clustering algorithms** to group data points and discover meaningful patterns in spatial data. You’ll use visualization techniques, such as plotting results on interactive maps, to gain insight into clusters and trends.

5. **Develop Problem-Solving and Critical Thinking Skills**  
   Throughout these labs, you’ll encounter various tasks and challenges that require creative thinking and troubleshooting. These real-world problems will strengthen your ability to approach spatial data mining tasks with confidence.

6. **Gain Hands-On Experience with Real-World Data**  
   All assignments in this lab involve working with real datasets and real-world scenarios. Whether you're analyzing crime data or exploring customer purchase patterns, each lab is designed to help you apply what you've learned in practical and meaningful ways.

Each lab assignment builds on these goals, ensuring that by the end of the semester, you’ll have both the theoretical knowledge and practical skills to work with spatial data.

Ready to dive in? Let’s get started and have fun learning!



Lab Format: How It Works
------------------------

Each lab assignment is structured to build on the topics we cover in our lectures. Here’s what you can expect:

1. **Tutorials**:

   - We'll walk you through specific examples and problems during the tutorial sessions.
   - These sessions are designed to show you how to use Python for spatial data analysis step by step.

2. **Homework Assignments**:

   - After each tutorial, you’ll receive tasks related to what you learned.
   - These assignments include programming tasks, creating output figures, and answering questions to test your understanding.
   - You’ll submit your Python code, results, and written answers as part of your homework.


.. important::
    
    While discussing ideas and working alongside your peers is a great way to learn, it’s **essential** that the work you submit is entirely your own.
    (In short: don't copy/paste from other students).


.. admonition:: Teaching Details
    :class: hint
    
    - Our labs combine in-person and online experiences. You'll find detailed tutorials, resources, and assignments available online, but we'll also hold in-person lab sessions for hands-on guidance and interactive discussions.
    - Need help outside class? You can always post your questions or join discussions on the course's D2L platform.


Grading
-------

Your lab assignments will be graded on a scale from 0 to 20 points each, with four labs throughout the semester. Together, the labs contribute to a total of **100 points**.

Here’s how the grading works for each lab:

- **Score of 20**: Excellent work, meeting all requirements with minimal or no errors.
- **Score of 15-19**: Good work, fulfilling most requirements but with minor issues.
- **Score of 10-14**: Satisfactory, but significant improvements are needed in key areas.
- **Score below 10**: Incomplete or insufficient work that requires major improvement.


.. note::
    **Assignment Deadlines**:
    Make sure to submit your assignments on time. Generally, assignments are due within two weeks (we'll let you know the exact date and time for each assignment).

    **Late submissions**: If you need a bit more time, no worries! Late work is accepted with a penalty:

    - **Up to 24 hours late**: 10% penalty
    - **Between 24 and 48 hours late**: 20% penalty
    - **More than 48 hours late**: Sorry, no more submissions accepted, and you'll lose the marks for that assignment.

    **Tip**: Stay on top of your assignments to avoid last-minute stress!





Lab Timeline
------------

Our labs will run for 13 weeks, starting on **January 14, 2025**, and spanning the entire winter semester. New material will be posted every two weeks.

Here’s a preview of the topics we’ll cover:


+------------+--------------------------------------------------+
| Assignment | Topic                                            |
+============+==================================================+
| **0**      | (kick-off) Installing Python + Setting up GitHub |
+------------+--------------------------------------------------+
| **1**      | Data preparation with pandas and Visualization   |
+------------+--------------------------------------------------+
| **2**      | Market Basket Analysis, Frequent Pattern Mining  |
+------------+--------------------------------------------------+
| **3**      | Classification and Spatial Analysis              |
+------------+--------------------------------------------------+
| **4**      | Spatial Clustering                               |
+------------+--------------------------------------------------+
| **5**      | (Optional) Spatial-Temporal Trajectory Mining    |
+------------+--------------------------------------------------+

.. toctree::
    :maxdepth: 2
    :caption: Kick-off Assignment

    tutorials/L1/overview
    tutorials/L1/python_installation
    tutorials/L1/git_and_github
    tutorials/L1/prepration
    notebooks/L1/vs_code
    tutorials/L1/assignment-1

.. toctree::
    :maxdepth: 2
    :caption: Assignment 1

    tutorials/L2/overview
    tutorials/L2/libraries
    notebooks/L2/using_pandas
    notebooks/L2/plotting
    tutorials/L2/tasks

.. toctree::
    :maxdepth: 2
    :caption: Assignment 2

    notebooks/L3/association_rules
    notebooks/L3/geopandas
    tutorials/L3/tasks

.. toctree::
   :maxdepth: 2
   :caption: Assignment 3

   tutorials/L4/classification
   notebooks/L4/classification_with_python
   tutorials/L4/tasks

.. toctree::
    :maxdepth: 2
    :caption: Assignment 4

    tutorials/L5/clustering
    notebooks/L5/Clustering_Geospatial_Data
    tutorials/L5/tasks

.. toctree::
    :maxdepth: 2
    :caption: Bonus Assignment

    tutorials/L6/tasks