
Boilerplate

We need to install Python, get the dependencies, set up a project structure, and finally make sure everything works. To keep it simple, we just want to install our library and use it reliably. We test the setup by defining and printing some default paths, because why not.


  • install git
  • install uv
sudo apt install git curl
curl -LsSf https://astral.sh/uv/install.sh | sh

  • init project
mkdir thesis && cd thesis
uv init --lib --verbose .
uv sync

# Activate virtual environment / Recognized automatically by VS Code
source .venv/bin/activate

echo "Using python version  : $(cat .python-version)"
echo "Created pyproject.toml: $(head -n 1 pyproject.toml)"
echo "Created virtual env   : $(ls ./.venv/bin/activate)"
  • make note of python version used, use latest if needed
python -c "import sys; print(sys.version)"

  • install torch, torchvision, pandas, matplotlib
uv add torch torchvision pandas matplotlib
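
A quick way to confirm the dependencies actually landed in the virtual environment is to ask Python whether it can locate them. The snippet below is only a sketch: the helper name missing_packages is made up for illustration, and it only checks that each package can be found, not that it imports cleanly.

```python
from importlib.util import find_spec

# Packages added with `uv add` above.
REQUIRED = ["torch", "torchvision", "pandas", "matplotlib"]


def missing_packages(names):
    """Hypothetical helper: return the names Python cannot locate."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
print("missing:", missing if missing else "none")
```

Run it inside the activated environment; anything listed as missing probably needs another uv add.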

  • create data, config, outputs, experiments directories
mkdir data config outputs experiments
  • Replace the content of src/thesis/__init__.py with a structure we will use very often.
from pathlib import Path

class Paths:
    source = Path(".")
    data = Path("data")
    config = Path("config")
    experiments = Path("experiments")
    outputs = Path("outputs")

  • activate virtual env
source .venv/bin/activate
  • test that module is installed and default paths can be used
python -c "import thesis; print(f'{thesis.Paths.data = }')"
# thesis.Paths.data = PosixPath('data')
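
Since the paths in Paths are relative, they resolve against whatever directory Python is started from. One way this could be used is sketched below: a hypothetical ensure_dirs helper (not part of the project) that creates the directories under a given root. The Paths class is repeated so the snippet runs on its own, and a temporary directory stands in for the project root.

```python
import tempfile
from pathlib import Path


class Paths:
    # Same relative defaults as in src/thesis/__init__.py
    source = Path(".")
    data = Path("data")
    config = Path("config")
    experiments = Path("experiments")
    outputs = Path("outputs")


def ensure_dirs(root: Path) -> list[Path]:
    """Hypothetical helper: create the project directories under `root`."""
    created = []
    for rel in (Paths.data, Paths.config, Paths.experiments, Paths.outputs):
        target = root / rel
        target.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        created.append(target)
    return created


# Demo in a throwaway directory so nothing in the real project is touched.
with tempfile.TemporaryDirectory() as tmp:
    dirs = ensure_dirs(Path(tmp))
    names = sorted(d.name for d in dirs)
    all_exist = all(d.is_dir() for d in dirs)

print(names)
print(all_exist)
```

Passing the root explicitly keeps the helper independent of the current working directory, which is the main thing to watch out for with relative defaults.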

Another file that will be very useful later is src/thesis/utils/db.py:

import sqlite3
from pathlib import Path
from typing import Literal

import pandas as pd


def save_db(
    df: pd.DataFrame,
    db_path: Path | str,
    table: str,
    if_exists: Literal["fail", "replace", "append"] = "replace",
    index: bool = False,
    verbose: bool = False,
) -> None:
    """Save DataFrame to sqlite3 db as specified table.
    NOTE : default behaviour is to replace the table if it exists."""
    # Validate first so we never open a connection we will not use.
    if len(df.columns) == 0:
        print(f"Cannot save df without columns ->\n{df}")
        return

    conn = sqlite3.connect(str(db_path))
    try:
        df.to_sql(table, conn, if_exists=if_exists, index=index)
    finally:
        conn.close()  # always release the connection, even if to_sql raises

    if verbose:
        print(df)
        print(df.columns)
        print(f"Saved to => {table} => {db_path}")


def load_db(
    db_path: Path | str,
    table: str,
    verbose: bool = False,
) -> pd.DataFrame:
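    """Read sqlite3 db, return all rows from specified table."""
    conn = sqlite3.connect(str(db_path))
    try:
        # Double quotes are the standard SQL way to quote an identifier such as a table name.
        df = pd.read_sql_query(f'SELECT * FROM "{table}"', conn)
    finally:
        conn.close()  # always release the connection, even if the query fails
    if verbose:
        print(df)
        print(df.columns)
    return df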
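
To see what the if_exists option means in practice, here is a small round trip using the same pandas calls that save_db and load_db wrap. This is a sketch under assumptions: the table name "runs", the column names, and the values are made up for illustration, and the database lives in a temporary directory rather than in the project.

```python
import sqlite3
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical example data; nothing here comes from a real experiment.
df = pd.DataFrame({"epoch": [1, 2], "loss": [0.9, 0.5]})

with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "example.db"

    conn = sqlite3.connect(str(db_path))
    try:
        # "replace" drops and recreates the table (the save_db default).
        df.to_sql("runs", conn, if_exists="replace", index=False)
        # "append" adds the rows to the existing table instead.
        df.to_sql("runs", conn, if_exists="append", index=False)
        loaded = pd.read_sql_query('SELECT * FROM "runs"', conn)
    finally:
        conn.close()

print(len(loaded))           # 2 rows from the replace + 2 from the append
print(list(loaded.columns))
```

With the default "replace", calling save_db twice with the same table leaves only the second DataFrame, so "append" is the one to pick when results should accumulate across runs.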