
How to set up SFTP on EC2 to allow file uploads

· About 2 min read
Mikyan
White Shiba Inu

SFTP uses the SSH protocol, allowing users to transfer files securely. It is a subsystem of SSH, so it runs on port 22 by default.

Although the name contains FTP, it does not implement the FTP protocol. However:

  • It provides the same functionality as FTP
  • It is widely supported by FTP clients

When you need to transfer files to and from a server, it can be a good choice.

How to set up an SFTP user that can upload files into a specific folder

The following script sets up an SFTP user.

Save it as setup_sftp_user.sh, then run the following commands:

chmod +x setup_sftp_user.sh
sudo ./setup_sftp_user.sh vendor

It will prompt you to set the password and configure the rest automatically.

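#!/bin/bash

# Check if running as root
if [ "$EUID" -ne 0 ]; then
    echo "Please run as root or use sudo"
    exit 1
fi

# Check if username is provided
if [ -z "$1" ]; then
    echo "Usage: $0 <username>"
    exit 1
fi

# Use a dedicated variable name; USER is already set by the shell
SFTP_USER="$1"

# Create user
useradd -m "$SFTP_USER"
passwd "$SFTP_USER"

# Setup SFTP directories
# The chroot directory itself must be owned by root and not writable by others
mkdir -p "/var/sftp/$SFTP_USER/uploads"
chown root:root "/var/sftp/$SFTP_USER"
chmod 755 "/var/sftp/$SFTP_USER"

# The user can only write inside uploads/
chown "$SFTP_USER:$SFTP_USER" "/var/sftp/$SFTP_USER/uploads"
chmod 755 "/var/sftp/$SFTP_USER/uploads"

# Backup sshd_config
cp /etc/ssh/sshd_config /etc/ssh/sshd_config.bak

# Add SSH config for user
cat <<EOL >> /etc/ssh/sshd_config

Match User $SFTP_USER
    ChrootDirectory /var/sftp/$SFTP_USER
    ForceCommand internal-sftp
    AllowTcpForwarding no
    X11Forwarding no
EOL

# Validate the config before restarting, so a broken file cannot lock you out
sshd -t || { echo "sshd_config is invalid; restore /etc/ssh/sshd_config.bak"; exit 1; }

# Restart SSH daemon (the service is named "ssh" on Debian/Ubuntu)
systemctl restart sshd

# Confirm success
echo "SFTP user '$SFTP_USER' has been set up successfully."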

Best Practice

· About 1 min read
Mikyan
White Shiba Inu

To simplify and unify system logic, the following conventions for handling time are considered best practice:

  • Store everything in UTC
  • Use UTC everywhere in backend
  • Use UTC, with ISO-8601 format in API layer
  • Frontend: Convert to user's local time
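
As a minimal sketch of this convention in Python, using only the standard library: the backend keeps a timezone-aware UTC value, the API layer serializes it as ISO-8601, and only the presentation step converts to local time. The function names `to_api_string` and `to_local` and the fixed UTC+9 offset are illustrative assumptions, not part of any framework.

```python
from datetime import datetime, timedelta, timezone

def to_api_string(dt: datetime) -> str:
    """Serialize an aware datetime as an ISO-8601 string in UTC."""
    return dt.astimezone(timezone.utc).isoformat()

def to_local(iso_string: str, offset_hours: int) -> datetime:
    """Parse an ISO-8601 string and convert it to a fixed-offset local time."""
    local_tz = timezone(timedelta(hours=offset_hours))
    return datetime.fromisoformat(iso_string).astimezone(local_tz)

# Backend: create and store timestamps as timezone-aware UTC values.
created_at = datetime(2025, 7, 20, 3, 30, tzinfo=timezone.utc)

# API layer: ISO-8601 with an explicit UTC offset.
payload = to_api_string(created_at)
print(payload)  # 2025-07-20T03:30:00+00:00

# Presentation: convert to the user's zone (UTC+9 here, as an example).
local_time = to_local(payload, 9)
print(local_time)  # 2025-07-20 12:30:00+09:00
```

In a real frontend the conversion would normally happen in the browser; the last step only illustrates that the conversion is applied at the edge, never to the stored value.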

How to do one thing well

· About 1 min read
Mikyan
White Shiba Inu

There are many situations in which we work on something we are not familiar with, or for which there are no known best practices. Luckily, we can still use a framework and principles to tackle them scientifically.

Problem solving has 4 basic elements:

  • Solve the real problem
  • Build a causal model for knowledge
  • Believe in the principle/best practice
  • Get feedback and iterate on the knowledge

Science

Science is a systematic enterprise that builds and organizes knowledge in the form of testable explanations and predictions about the universe.

Theory Facts

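The 7 Habits: Be Proactive

· About 1 min read
Mikyan
White Shiba Inu

The first of the 7 Habits is to be proactive.

Be aware of the freedom to choose that lies between what happens each day and how we respond, and always

Things happen in our lives every day, and we respond to them. Between what happens and our response, we have the freedom to choose. Being aware of that freedom to choose,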

Python File System Operations

· About 3 min read
Mikyan
White Shiba Inu

  • Use pathlib to handle file paths and operations by default
  • For fine-grained control over file I/O (streaming), use a context manager
  • Use the tempfile Module for Temporary Files/Directories
  • Use shutil for High-level Operations

Details

pathlib is the modern, object-oriented way to handle file paths and operations.

from pathlib import Path

# Create a Path object
my_file = Path("data.txt")

# Use Path methods
my_file.exists()

# Content I/O (text)
my_file.write_text("hello")
my_file.read_text()

# Content I/O (binary)
# my_file.write_bytes(b"...")
# my_file.read_bytes()

Create a Path Object

# From the current working directory (CWD)
current_dir = Path.cwd()
# From the home directory
home_dir = Path.home()
# From an absolute path
abs_path = Path("/usr/local/bin/python")
# From a relative path (relative to the CWD)
relative_path = Path("data/input.csv")

# Create a path by joining segments with the / operator
base_dir = Path('/opt/my_app')
config_file = base_dir / "config" / "settings.yaml"

parent_dir = config_file.parent  # /opt/my_app/config

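Dealing with file names

# Get file / directory name
config_file.name  # settings.yaml

# Get the stem (the name without its last suffix)
config_file.stem  # settings

# Get the suffix(es)
config_file.suffix    # .yaml
config_file.suffixes  # ['.yaml']

# Get absolute path
config_file.resolve()   # also resolves symlinks and ".." components
# or
config_file.absolute()  # only prepends the CWD when the path is relative

# Get relative path
relative_to = config_file.relative_to(base_dir)  # config/settings.yaml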

Check / Query File System

my_file.exists()

my_file.is_file()
my_file.is_dir()
my_file.is_symlink()

# Statistics (size, timestamps, permissions)
stats = my_file.stat()

Operations

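# Create a directory (parents=True also creates missing parents; exist_ok=True ignores an existing one)
new_dir.mkdir(parents=True, exist_ok=True)

# Create an empty file
empty_file.touch()

# Delete a file
file_to_delete.unlink()

# Delete an empty directory
empty_folder.rmdir()

# Rename / move a file or directory
old_path.rename(new_path)

# Change the suffix (returns a new Path; it does not rename the file on disk)
config_file.with_suffix('.yml')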

File Content I/O

config_path = Path("config.txt")
config_path.write_text("debug=True\nlog_level=INFO")
content = config_path.read_text()

binary_data_file = Path("binary_data.bin")
binary_data_file.write_bytes(b'\x01\x02\x03\x04')
data = binary_data_file.read_bytes()
print(f"Binary data: {data}")

Directory iteration / traversal

# List
project_root.iterdir()

# Globbing
project_root.glob("*.py")

# Walking Directory Tree (Python 3.12+)
project_root.walk()
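
A small runnable sketch of these calls, using a throwaway directory so it does not depend on any real project layout (the file names are made up for illustration). `iterdir()` and `glob()` return lazy iterators, so they are wrapped in `sorted()` here; `rglob()` is the recursive variant of `glob()` and works on versions older than 3.12, where `walk()` is not available.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    project_root = Path(tmp)
    (project_root / "pkg").mkdir()
    (project_root / "main.py").touch()
    (project_root / "README.md").touch()
    (project_root / "pkg" / "util.py").touch()

    # Direct children only (files and directories)
    children = sorted(p.name for p in project_root.iterdir())
    print(children)  # ['README.md', 'main.py', 'pkg']

    # Pattern match in the top-level directory only
    top_level_py = sorted(p.name for p in project_root.glob("*.py"))
    print(top_level_py)  # ['main.py']

    # Recursive pattern match through all subdirectories
    all_py = sorted(
        p.relative_to(project_root).as_posix() for p in project_root.rglob("*.py")
    )
    print(all_py)  # ['main.py', 'pkg/util.py']
```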

Use Context Managers (with open(...)) for File I/O

When you need more fine-grained control over file reading and writing (streaming large files, specific encodings, or binary modes), use the with statement.

try:
    with open("my_large_file.csv", "w", encoding="utf-8") as f:
        f.write("Header1,Header2\n")
        for i in range(1000):
            f.write(f"data_{i},value_{i}\n")
except OSError as e:  # IOError is an alias of OSError in Python 3
    print(f"Error writing file: {e}")
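
Reading works the same way. As a sketch (the file location and row count are arbitrary), iterating over the file object yields one line at a time, so the whole file never has to fit in memory:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "my_large_file.csv"

    # Write a header plus 1000 data rows
    with open(csv_path, "w", encoding="utf-8") as f:
        f.write("Header1,Header2\n")
        for i in range(1000):
            f.write(f"data_{i},value_{i}\n")

    # Stream the file back line by line
    row_count = 0
    with open(csv_path, "r", encoding="utf-8") as f:
        next(f)  # skip the header line
        for line in f:
            row_count += 1

print(row_count)  # 1000
```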

Use the tempfile Module for Temporary Files/Directories

import tempfile
from pathlib import Path

# Using a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir_str:
    tmp_dir = Path(tmp_dir_str)
    temp_file = tmp_dir / "temp_report.txt"
    temp_file.write_text("Ephemeral data.")
    print(f"Created temporary file at: {temp_file}")
# At the end of the 'with' block, the directory and its contents are deleted
print("Temporary directory removed.")

Use shutil for High-level Operations

shutil focuses on operations that involve moving, copying, or deleting entire trees of files and directories, and on other utility functions that go beyond a single Path object's scope.

import shutil
from pathlib import Path

source_dir = Path("my_data")
destination_dir = Path("backup_data")

# Copy an entire directory tree
try:
    shutil.copytree(source_dir, destination_dir)
    print(f"Copied '{source_dir}' to '{destination_dir}'")
except FileExistsError:
    print(f"Destination '{destination_dir}' already exists. Skipping copy.")
except Exception as e:
    print(f"Error copying tree: {e}")

# Delete an entire directory tree
dir_to_delete = Path("backup_data")  # Assuming this exists from the copytree example

if dir_to_delete.exists():
    print(f"Deleting '{dir_to_delete}'...")
    shutil.rmtree(dir_to_delete)
    print("Directory deleted.")
else:
    print(f"Directory '{dir_to_delete}' does not exist.")

Zip / Tarring

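shutil can even create compressed archives and unpack them.

archive_name = "my_data_backup"  # example base name; the extension is added automatically
archive_path = shutil.make_archive(archive_name, 'zip', source_dir)
print(f"Created archive: {archive_path}")

# Unpack the archive into another directory (example name)
shutil.unpack_archive(archive_path, "restored_data")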

Copy File Metadata

  • shutil.copystat(src, dst) copies permission bits, last access time, last modification time, and flags from one file to another; the file contents are not touched
  • shutil.copy2(src, dst) copies the file and metadata
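
The difference between the two can be checked with a short sketch. The file names and the fixed timestamp are arbitrary; `shutil.copyfile` is used for the content-only copy so that the effect of `copystat` is visible on its own.

```python
import os
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "report.txt"
    src.write_text("quarterly numbers")

    # Give the source a fixed, old modification time (epoch second 1,000,000,000)
    os.utime(src, (1_000_000_000, 1_000_000_000))

    # copy2 copies the content and tries to preserve the metadata
    dst_with_meta = Path(tmp) / "report_copy2.txt"
    shutil.copy2(src, dst_with_meta)

    # copyfile copies the content only; copystat then applies the metadata
    dst_two_step = Path(tmp) / "report_two_step.txt"
    shutil.copyfile(src, dst_two_step)
    mtime_before = int(dst_two_step.stat().st_mtime)
    shutil.copystat(src, dst_two_step)
    mtime_after = int(dst_two_step.stat().st_mtime)

    same_content = dst_with_meta.read_text() == src.read_text()
    copy2_mtime = int(dst_with_meta.stat().st_mtime)

print(same_content)  # True
print(copy2_mtime)   # 1000000000
print(mtime_after)   # 1000000000
```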

Getting Disk Usage

usage = shutil.disk_usage(Path(".")) # Check current directory's disk
print(f"Total: {usage.total / (1024**3):.2f} GB")
print(f"Used: {usage.used / (1024**3):.2f} GB")
print(f"Free: {usage.free / (1024**3):.2f} GB")

Do not

  • Avoid os.system() or subprocess.run() for file operations in most cases
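
A sketch of what this means in practice: each shell command in the comments has a pathlib or shutil equivalent that is portable across operating systems, raises a Python exception on failure, and needs no shell quoting. The directory and file names are invented for the example.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Instead of: os.system("mkdir -p build/assets")
    assets = root / "build" / "assets"
    assets.mkdir(parents=True, exist_ok=True)

    # Instead of: os.system("cp logo.txt build/assets/")
    logo = root / "logo.txt"
    logo.write_text("logo")
    shutil.copy2(logo, assets / logo.name)
    copied = (assets / "logo.txt").exists()

    # Instead of: os.system("rm -rf build")
    shutil.rmtree(root / "build")
    removed = not (root / "build").exists()

print(copied, removed)  # True True
```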

Python Async Programming

· About 2 min read
Mikyan
White Shiba Inu

Python's asynchronous programming is built around the asyncio module, and async/await keywords.

Concept

A coroutine is a special type of function that represents a computation that can be paused and resumed.

A coroutine is defined with async def.

For example, the following function is a coroutine:

import asyncio

async def my_coroutine():
    print("Coroutine started")
    await asyncio.sleep(1)  # This is a pause point
    print("Coroutine resumed after 1 second")
    return "Done!"

  • Inside an async def function, the await keyword is used to pause the execution of the current coroutine.
  • When a coroutine awaits something, it signals to the event loop that it is waiting for an I/O operation or some other asynchronous event to complete.
  • While the current coroutine is paused, the event loop can switch its attention to other coroutines or tasks that are ready to run, ensuring efficient use of the CPU.
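
The switching described above can be observed with a small sketch. The coroutine name `fetch` and the delays are made up; `asyncio.sleep` stands in for real I/O. Two coroutines run on one thread, and the recorded events show that B starts and finishes while A is still paused.

```python
import asyncio

events: list[str] = []

async def fetch(name: str, delay: float) -> str:
    events.append(f"{name} started")
    await asyncio.sleep(delay)  # pause point: control returns to the event loop
    events.append(f"{name} finished")
    return name

async def main() -> list[str]:
    # Schedule both coroutines concurrently and wait for both results
    return await asyncio.gather(fetch("A", 0.2), fetch("B", 0.1))

results = asyncio.run(main())

print(results)  # ['A', 'B'] (gather returns results in argument order)
print(events)   # ['A started', 'B started', 'B finished', 'A finished']
```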

Why async def functions can be paused

  • A regular def function is executed directly by the Python interpreter. When you call it, the interpreter moves through its instructions sequentially. If it encounters something that blocks, the entire thread stops until that blocking operation is done.

  • An async def function does not execute its body immediately when called. Instead, it returns a coroutine object. This object behaves much like a generator, and the asyncio event loop knows how to drive it.

  • Use the await keyword to signal an intentional pause.

  • If there is no await inside an async def function, it runs like a regular synchronous function until completion once it is started.
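
A minimal sketch of the second and fourth points (the function `add` is a made-up example): calling an async def function only creates a coroutine object, and the body runs when the event loop drives it.

```python
import asyncio
import inspect

async def add(a: int, b: int) -> int:
    # No await inside: once started, the body runs straight through
    return a + b

coro = add(1, 2)  # the body has not run yet
is_coroutine = inspect.iscoroutine(coro)
print(is_coroutine)  # True

result = asyncio.run(coro)  # the event loop drives it to completion
print(result)  # 3
```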

The Event loop is the orchestrator.

  • The asyncio event loop is continuously monitoring a set of registered coroutines/tasks. It's like a dispatcher.

  • State Preservation: (Generators)

Conceptually, Python coroutines are built on top of generators. When a generator yields a value, its local state (variables, instruction pointer) is saved. When next() is called on it again, it resumes from where it left off.

Similarly, when an async def function awaits, its internal state is saved. When the awaited operation completes, the coroutine is "sent" a signal to resume, and it continues execution from the line immediately following the await.
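
The generator behaviour this builds on can be shown with plain Python. In this sketch (the name `running_total` is illustrative), the local variable `total` survives every pause, and `send()` resumes the generator with a value, much like a coroutine being resumed with the result of an awaited operation.

```python
def running_total():
    total = 0
    while True:
        received = yield total  # pause here; 'total' is kept until resumed
        total += received

gen = running_total()
first = next(gen)      # runs up to the first yield
second = gen.send(5)   # resumes after the yield with received = 5
third = gen.send(10)   # 'total' was preserved between resumptions

print(first, second, third)  # 0 5 15
```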

Why Async is important for web framework

Python Pydantic

· About 3 min read
Mikyan
White Shiba Inu

Pydantic can extend standard Python classes to provide robust data handling features. BaseModel is the fundamental class in Pydantic. By inheriting from BaseModel, Python classes become Pydantic models, gaining capabilities for:

  • Data Validation: Automatically checks the types and values of class attributes against your defined type hints. It raises a ValidationError with clear, informative messages if incoming data doesn't conform.
  • Data Coercion: Pydantic can intelligently convert input data to the expected type where appropriate.
  • Instantiation: Creates instances of your model by passing keyword arguments or a dictionary to the constructor

Details

Inheriting from BaseModel, Python classes become Pydantic models

You can use the Field function, or annotations, to add more specific constraints and metadata to your fields.



from typing import Optional
from pydantic import BaseModel, Field, EmailStr, ValidationError

class User(BaseModel):
    name: str
    age: int
    email: str

# Valid data
user = User(name="Alice", age=30, email="alice@example.com")
print(user)

# Invalid data raises a ValidationError (here: age is not a valid integer;
# email is a plain str, so "bob@invalid" is accepted)
try:
    User(name="Bob", age="twenty", email="bob@invalid")
except ValidationError as e:
    print(e)


class Product(BaseModel):
    id: int = Field(..., gt=0, description="Unique product identifier")
    name: str = Field(..., min_length=2, max_length=100)
    price: float = Field(..., gt=0.0)
    description: Optional[str] = None  # Optional field
    seller_email: EmailStr  # Email validation; requires the email-validator package

product = Product(id=1, name="Laptop", price=1200.50, seller_email="seller@store.com")
print(product)

Create Pydantic Model

Use the constructor directly with an unpacked dictionary, or use model_validate to validate a dict and convert it to a model.

model_validate_json validates a JSON string and converts it to a model.

user_data = {
    "name": "Alice",
    "age": 30,
    "email": "alice@example.com"
}
user_model = User(**user_data)

user_model = User.model_validate(user_data)


class Movie(BaseModel):
    title: str
    year: int
    director: str
    genres: list[str]

# Your JSON string data
json_string = '''
{
    "title": "Inception",
    "year": 2010,
    "director": "Christopher Nolan",
    "genres": ["Sci-Fi", "Action", "Thriller"]
}
'''
movie_model = Movie.model_validate_json(json_string)

Validate dictionary and JSON string: model_validate(), model_validate_json()

  • model_validate: validates a Python dictionary
  • model_validate_json: validates a JSON string

from pydantic import BaseModel

class Item(BaseModel):
    name: str
    quantity: int

data_dict = {"name": "Apple", "quantity": 5}
item1 = Item.model_validate(data_dict)
print(item1)

json_data = '{"name": "Banana", "quantity": 10}'
item2 = Item.model_validate_json(json_data)
print(item2)

Serialization: model_dump(), model_dump_json().

  • model_dump: converts to a Python dictionary
  • model_dump_json: converts to a JSON string

from pydantic import BaseModel

class City(BaseModel):
    name: str
    population: int

tokyo = City(name="Tokyo", population=14000000)
print(tokyo.model_dump())
print(tokyo.model_dump_json(indent=2))  # Pretty print JSON

Custom Validators: @field_validator, @model_validator

from datetime import date
from pydantic import BaseModel, ValidationError, field_validator, model_validator

class Event(BaseModel):
    name: str
    start_date: date
    end_date: date

    @field_validator('name')
    @classmethod
    def check_name_is_not_empty(cls, v):
        if not v.strip():
            raise ValueError('Event name cannot be empty')
        return v

    @model_validator(mode='after')  # 'after' means after field validation
    def check_dates_order(self):
        if self.start_date > self.end_date:
            raise ValueError('Start date must be before end date')
        return self

# Valid: the date strings are coerced to date objects
try:
    event1 = Event(name="Conference", start_date="2025-07-20", end_date="2025-07-22")
    print(event1)
except ValidationError as e:
    print(e)

# Invalid: the end date is earlier than the start date
try:
    Event(name="Bad Event", start_date="2025-07-25", end_date="2025-07-23")
except ValidationError as e:
    print(e)

Nested Models

from pydantic import BaseModel
from typing import List

class Address(BaseModel):
    street: str
    city: str
    zip_code: str

class Customer(BaseModel):
    customer_id: int
    name: str
    shipping_addresses: List[Address]

customer_data = {
    "customer_id": 123,
    "name": "Jane Doe",
    "shipping_addresses": [
        {"street": "123 Main St", "city": "Anytown", "zip_code": "12345"},
        {"street": "456 Oak Ave", "city": "Otherville", "zip_code": "67890"}
    ]
}

# Each nested dict is validated and converted to an Address instance
customer = Customer.model_validate(customer_data)
print(customer)

JSON Schema Generation

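import json
from pydantic import BaseModel

class Task(BaseModel):
    id: int
    title: str
    completed: bool = False

# model_json_schema() returns a dict and has no indent parameter; use json.dumps to pretty print it
print(json.dumps(Task.model_json_schema(), indent=2))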


Python Type Hint

· About 2 min read
Mikyan
White Shiba Inu

Python introduced type hints in version 3.5, and they have become more and more powerful since.

With them, you can declare the types of your variables, which improves readability.

Type hints are hints, not enforcements. Python still runs the code even if types don't match.
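
A short sketch of that point (the function `double` is a made-up example): the call below violates the annotation, yet Python runs it without complaint. The hints are only stored as metadata, which static checkers such as mypy read before the code ever runs.

```python
def double(x: int) -> int:
    return x * 2

# The annotation says int, but Python does not check it at runtime
result = double("ab")
print(result)  # abab

# The hints are stored as metadata that tools can inspect
print(double.__annotations__)  # {'x': <class 'int'>, 'return': <class 'int'>}
```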

Usage

# Primitives
name: str = "Tom"
age: int = 30
salary: float = 500.5
is_active: bool = True

# Collections
numbers: list = [1,2,3]
scores: tuple = (90, 85, 88)
unique: set = {1, 2, 3}
data: dict = {"key": "value"}


# Specific Collection Types

from typing import List, Dict, Tuple, Set

names: List[str] = ["Alice", "Bob", "Charlie"]
user: Dict[str, str] = {
"name": "John",
"email": "john@example.com"
}
person: Tuple[str, int, bool] = ("Alice", 30, True)
unique_ids: Set[int] = {1, 2, 3, 4, 5}

# Since Python 3.9, the built-in types can be used directly
names: list[str] = ["Alice", "Bob", "Charlie"]
user: dict[str, str] = {
    "name": "John",
    "email": "john@example.com"
}
person: tuple[str, int, bool] = ("Alice", 30, True)
unique_ids: set[int] = {1, 2, 3, 4, 5}

# Optional

from typing import Optional

# can be string or None
middle_name: Optional[str] = None

# Union
from typing import Union
number: Union[int, float] = 10
number = 10.5


# Literal for exact values
from typing import Literal
Status = Literal["pending", "approved", "rejected"]

def process_order(status: Status) -> None:
    pass

# TypedDict for dictionary structures
from typing import TypedDict

class UserDict(TypedDict):
    name: str
    age: int
    email: str


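# Class (assumes a User class and a get_user() function are defined elsewhere)
user: User = get_user(123)

# Function
def calculate_bmi(weight: float, height: float) -> float:
    return weight / (height ** 2)

# Self (Python 3.11+)
from typing import Self

class User:
    def copy(self) -> Self:  # Returns an instance of the same class, including subclasses
        return type(self)()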