Boilerplate
We need to install Python, get the dependencies, set up a project structure, and finally check that everything works. To keep it simple, we just want to install our library and use it reliably. We test this by defining and printing some default paths, because why not.
- install git
- install uv
- init project
mkdir thesis && cd thesis
uv init --lib --verbose .
uv sync
# Activate virtual environment / Recognized automatically by VS Code
source .venv/bin/activate
echo "Using python version : $(cat .python-version)"
echo "Created pyproject.toml: $(cat pyproject.toml | head -n 1)"
echo "Created virtual env : $(ls ./.venv/bin/activate)"
- make note of python version used, use latest if needed
- install torch, pandas, matplotlib
- create data, config, outputs, experiments directories
- Replace content of src/thesis/__init__.py with a structure we will use very often.
from pathlib import Path


class Paths:
    source = Path(".")
    data = Path("data")
    config = Path("config")
    experiments = Path("experiments")
    outputs = Path("outputs")
- activate virtual env
- test that module is installed and default paths can be used
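The last step can be done from Python. A minimal sketch, assuming the package is installed into the active environment under the name `thesis` (the directory name used above); `default_paths` is a hypothetical helper for this check, not part of the project:

```python
import importlib
import importlib.util
from typing import Optional


def default_paths(package: str = "thesis") -> Optional[dict]:
    """Return the default paths defined on `<package>.Paths` as strings.

    Returns None if the package cannot be found by the current interpreter
    or if it does not define a `Paths` class.
    """
    # find_spec returns None for a top-level package that is not installed.
    if importlib.util.find_spec(package) is None:
        return None
    module = importlib.import_module(package)
    paths = getattr(module, "Paths", None)
    if paths is None:
        return None
    names = ("source", "data", "config", "experiments", "outputs")
    return {name: str(getattr(paths, name)) for name in names}


paths = default_paths()
if paths is None:
    print("thesis.Paths not found: is the virtual env active and `uv sync` done?")
else:
    for name, value in paths.items():
        print(f"{name:<12} {value}")
```

If the message about the virtual env shows up, the interpreter running the script is most likely not the one in `.venv`.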
Another file that will be very useful later is src/thesis/utils/db.py:
import sqlite3
from contextlib import closing
from pathlib import Path
from typing import Literal

import pandas as pd


def save_db(
    df: pd.DataFrame,
    db_path: Path | str,
    table: str,
    if_exists: Literal["fail", "replace", "append"] = "replace",
    index: bool = False,
    verbose: bool = False,
) -> None:
    """Save DataFrame to sqlite3 db as specified table.

    NOTE : default behaviour is to replace the table if it exists."""
    # Check before connecting, so no empty db file is created as a side effect.
    if len(df.columns) == 0:
        print(f"Cannot save empty df ->\n{df}")
        return
    # closing() releases the connection even if to_sql raises.
    with closing(sqlite3.connect(str(db_path))) as conn:
        df.to_sql(table, conn, if_exists=if_exists, index=index)
    if verbose:
        print(df)
        print(df.columns)
        print(f"Saved to => {table} => {db_path}")


def load_db(
    db_path: Path | str,
    table: str,
    verbose: bool = False,
) -> pd.DataFrame:
    """Read sqlite3 db, return all rows from specified table."""
    with closing(sqlite3.connect(str(db_path))) as conn:
        # Double quotes mark an identifier in SQL; single quotes are string literals.
        df = pd.read_sql_query(f'SELECT * FROM "{table}"', conn)
    if verbose:
        print(df)
        print(df.columns)
    return df
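Before calling load_db it can help to see which tables a database file actually contains. A minimal sketch using only the standard library; `list_tables` is a hypothetical helper that is not part of the module above, and the demo database with its `runs` table is made up for illustration:

```python
import sqlite3
import tempfile
from contextlib import closing
from pathlib import Path


def list_tables(db_path) -> list:
    """Return the names of all tables in the sqlite3 file at `db_path`."""
    with closing(sqlite3.connect(str(db_path))) as conn:
        # sqlite_master is SQLite's built-in catalogue of schema objects.
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


# Demo in a temporary directory, so nothing is written into the project.
with tempfile.TemporaryDirectory() as tmp:
    db_file = Path(tmp) / "demo.db"
    with closing(sqlite3.connect(str(db_file))) as conn:
        conn.execute("CREATE TABLE runs (id INTEGER, loss REAL)")
        conn.commit()
    demo_tables = list_tables(db_file)

print(demo_tables)
```

Since the table name is interpolated into the query string in load_db, checking it against this list first is one way to avoid passing an unexpected name.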