How to set up SFTP on EC2 to allow file uploads

January 5, 2025 · about 2 min read

Mikyan

White Shiba Inu

SFTP uses the SSH protocol to transfer files securely. It is a subsystem of SSH, so it runs on port 22 by default.

The name contains FTP, but SFTP does not implement the FTP protocol.

It provides the same functionality as FTP
It is widely supported by FTP clients

When you need to transfer files to or from a server, it can be a good choice.

How to set up an SFTP user that can upload files into a specific folder

The following script helps set up an SFTP user.

Save it to a .sh file, then run the following commands:

chmod +x setup_sftp_user.sh
sudo ./setup_sftp_user.sh vendor

It will prompt you to set the password and configure the rest automatically.

#!/bin/bash

# Check if running as root
if [ "$EUID" -ne 0 ]; then
  echo "Please run as root or use sudo"
  exit 1
fi

# Check if username is provided
if [ -z "$1" ]; then
  echo "Usage: $0 <username>"
  exit 1
fi

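# Avoid reusing the name of the built-in $USER environment variable
SFTP_USER="$1"

# Create the user and prompt for a password
# (some EC2 images disable password logins in sshd_config by default)
useradd -m "$SFTP_USER"
passwd "$SFTP_USER"

# Set up SFTP directories.
# The chroot directory must be owned by root and not writable by the user.
mkdir -p "/var/sftp/$SFTP_USER/uploads"
chown root:root "/var/sftp/$SFTP_USER"
chmod 755 "/var/sftp/$SFTP_USER"

# The user can only write inside uploads/
chown "$SFTP_USER:$SFTP_USER" "/var/sftp/$SFTP_USER/uploads"
chmod 755 "/var/sftp/$SFTP_USER/uploads"

# Back up sshd_config
cp /etc/ssh/sshd_config /etc/ssh/sshd_config.bak

# Add SSH config for the user
cat <<EOL >> /etc/ssh/sshd_config

Match User $SFTP_USER
    ChrootDirectory /var/sftp/$SFTP_USER
    ForceCommand internal-sftp
    AllowTcpForwarding no
    X11Forwarding no
EOL

# Validate the new config before restarting, so a bad config cannot lock you out
if ! sshd -t; then
  echo "sshd_config is invalid; restoring the backup"
  cp /etc/ssh/sshd_config.bak /etc/ssh/sshd_config
  exit 1
fi

# Restart SSH daemon (on some Debian/Ubuntu releases the unit is named "ssh")
systemctl restart sshd

# Confirm success
echo "SFTP user '$SFTP_USER' has been set up successfully."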

Best Practice

January 3, 2025 · about 1 min read

Mikyan

White Shiba Inu

To simplify and unify how a system handles time, the following conventions are considered best practice:

Store everything in UTC
Use UTC everywhere in the backend
Use UTC in ISO-8601 format in the API layer
Frontend: convert to the user's local time
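
The convention above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not a prescribed implementation: the timestamp is made up, and the fixed +09:00 offset is a hypothetical stand-in for whatever zone the user's client reports.

```python
from datetime import datetime, timedelta, timezone

# Backend / storage: always work with timezone-aware UTC datetimes.
created_at = datetime(2025, 1, 3, 12, 30, tzinfo=timezone.utc)

# API layer: serialize as ISO-8601 with an explicit UTC offset.
payload = created_at.isoformat()

# Client side: parse the ISO-8601 string and convert it for display.
# The +09:00 offset is an assumed example of a user's local zone.
user_zone = timezone(timedelta(hours=9), "JST")
parsed = datetime.fromisoformat(payload)
local = parsed.astimezone(user_zone)

print(payload)
print(local.isoformat())
```

Because both values are timezone-aware, `local` and `created_at` still compare equal: only the presentation changes, not the instant in time.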

How to do one thing well

May 11, 2024 · about 1 min read

Mikyan

White Shiba Inu

There are many situations in which we work on something we are not familiar with, or for which there are no known best practices. Luckily, we can still use a framework and principles to tackle them scientifically.

Problem solving has four basic elements:

Solve the real problem
Build a causal model for knowledge
Believe in the principle/best practice
Get feedback and iterate on the knowledge

Science

Science is a systematic enterprise that builds and organizes knowledge in the form of testable explanations and predictions about the universe.

Theory Facts

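The 7 Habits: Be Proactive

May 1, 2024 · about 1 min read

Mikyan

White Shiba Inu

The first of the 7 Habits is to be proactive.

Be aware of the freedom to choose between what happens each day and how we react to it, and always

Things happen in our lives every day, and we react to them. Between what happens and our reaction, we have the freedom to choose. Being aware of that freedom to choose,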

Python File System Operations

April 1, 2024 · about 3 min read

Mikyan

White Shiba Inu

Use pathlib to handle file paths and operations by default
For fine-grained control over file reads and writes (streaming), use a context manager
Use the tempfile module for temporary files and directories
Use shutil for high-level operations

Details

`pathlib` is the modern, object-oriented way to handle file paths and operations.

from pathlib import Path

# Create Path objects
my_file = Path("data")

# Use Path methods
my_file.exists()

# Content I/O
my_file.read_text()
# my_file.read_bytes()
# my_file.write_text("...")
# my_file.write_bytes(b"...")

Create a Path Object

# From the current working directory (CWD)
current_dir = Path.cwd()
# From Home
home_dir = Path.home()
# From absolute Paths
abs_path = Path("/usr/local/bin/python")
# From relative paths (relative to CWD)
relative_path = Path("data/input.csv")

# Create path by manipulation
base_dir = Path('/opt/my_app')
config_file = base_dir / "config" / "settings.yaml"

parent_dir = config_file.parent

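Dealing with file names

# Get the file / directory name
config_file.name  # settings.yaml

# Get the stem (name without the last suffix)
config_file.stem  # settings

# Get the suffix(es)
config_file.suffix    # .yaml
config_file.suffixes  # ['.yaml']

# Get the absolute path
config_file.resolve()   # also resolves symlinks and ".." components
# or
config_file.absolute()  # only makes the path absolute; no symlink resolution

# Get a path relative to another path
rel_path = config_file.relative_to(base_dir)  # config/settings.yaml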

Check / Query File System

my_file.exists()

my_file.is_file()
my_file.is_dir()
my_file.is_symlink()

# File statistics (size, timestamps, permissions)
stats = my_file.stat()

Operations

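# Create directories (parents=True creates missing parents, exist_ok=True tolerates an existing one)
new_dir.mkdir(parents=True, exist_ok=True)

# Create an empty file
empty_file.touch()

# Delete a file
file_to_delete.unlink()

# Delete an empty directory
empty_folder.rmdir()

# Rename / move a file or directory
old_path.rename(new_path)

# Change the suffix (returns a new Path; the file on disk is not renamed)
config_file.with_suffix('.yml')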

File Content I/O

config_path = Path("config.txt")
config_path.write_text("debug=True\nlog_level=INFO")
content = config_path.read_text()

binary_data_file = Path("binary_data.bin")
binary_data_file.write_bytes(b'\x01\x02\x03\x04')
data = binary_data_file.read_bytes()
print(f"Binary data: {data}")

Directory iteration / traversal

# List
project_root.iterdir()

# Globbing
project_root.glob("*.py")

# Walking Directory Tree (Python 3.12+)
project_root.walk()
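
To make the difference between these calls concrete, here is a small sketch that builds a throwaway tree and lists it three ways. The file names are made up for illustration, and `rglob` is used for the recursive case because it does not depend on Python 3.12 the way `Path.walk()` does.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pkg").mkdir()
    (root / "main.py").write_text("")
    (root / "pkg" / "util.py").write_text("")
    (root / "README.md").write_text("")

    # iterdir(): direct children only (files and directories)
    children = sorted(p.name for p in root.iterdir())

    # glob("*.py"): pattern match in this directory only
    top_level_py = sorted(p.name for p in root.glob("*.py"))

    # rglob("*.py"): the same pattern, applied recursively
    all_py = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.py"))

print(children)
print(top_level_py)
print(all_py)
```

All three methods return lazy iterators, so wrap them in `list()` or `sorted()` when you need the results more than once.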

Use Context Managers (with open(...)) for File I/O

When you need more fine-grained control over file reading and writing (streaming large files, a specific encoding, or binary modes), use the with statement.

try:
    with open("my_large_file.csv", "w", encoding="utf-8") as f:
        f.write("Header1,Header2\n")
        for i in range(1000):
            f.write(f"data_{i},value_{i}\n")
except IOError as e:
    print(f"Error writing file: {e}")

Use the `tempfile` Module for Temporary Files/Directories

import tempfile
from pathlib import Path

# Using a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir_str:
    tmp_dir = Path(tmp_dir_str)
    temp_file = tmp_dir / "temp_report.txt"
    temp_file.write_text("Ephemeral data.")
    print(f"Created temporary file at: {temp_file}")
    # At the end of the 'with' block, tmp_dir_str and its contents are deleted
print("Temporary directory removed.")

Use shutil for High-level Operations

shutil focuses on operations that involve moving, copying, or deleting entire trees of files and directories, plus other utility functions that go beyond a single Path object's scope.

import shutil
from pathlib import Path

source_dir = Path("my_data")
destination_dir = Path("backup_data")


try:
    shutil.copytree(source_dir, destination_dir)
    print(f"Copied '{source_dir}' to '{destination_dir}'")
except FileExistsError:
    print(f"Destination '{destination_dir}' already exists. Skipping copy.")
except Exception as e:
    print(f"Error copying tree: {e}")

import shutil
from pathlib import Path

dir_to_delete = Path("backup_data") # Assuming this exists from the copytree example

if dir_to_delete.exists():
    print(f"Deleting '{dir_to_delete}'...")
    shutil.rmtree(dir_to_delete)
    print("Directory deleted.")
else:
    print(f"Directory '{dir_to_delete}' does not exist.")

Zip / Tarring

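shutil can also create compressed archives and unpack them.

# "backup" is a placeholder archive path (without extension); source_dir is the directory to pack
archive_path = shutil.make_archive("backup", "zip", source_dir)
print(f"Created archive: {archive_path}")
shutil.unpack_archive(archive_path, "restored_data")  # placeholder target directory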

Copy File Metadata

shutil.copystat(src, dst) copies permission bits, last access time, last modification time, and flags from one file to another
shutil.copy2(src, dst) copies the file and metadata
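
The difference is easiest to see on a file with an old timestamp. The sketch below is illustrative only: the file names and the timestamp value are made up, and it assumes a filesystem that preserves modification times.

```python
import os
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "report.txt"
    src.write_text("hello")
    # Give the source an old modification time so the difference is visible.
    os.utime(src, (1_000_000_000, 1_000_000_000))

    plain = Path(tmp) / "plain_copy.txt"
    with_meta = Path(tmp) / "meta_copy.txt"

    shutil.copy(src, plain)       # content and permission bits only
    shutil.copy2(src, with_meta)  # content plus metadata such as timestamps

    src_mtime = int(src.stat().st_mtime)
    plain_mtime = int(plain.stat().st_mtime)
    meta_mtime = int(with_meta.stat().st_mtime)

    # copystat can bring the metadata over after the fact.
    shutil.copystat(src, plain)
    plain_mtime_after = int(plain.stat().st_mtime)

print(src_mtime, plain_mtime, meta_mtime, plain_mtime_after)
```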

Getting Disk Usage

usage = shutil.disk_usage(Path(".")) # Check current directory's disk
print(f"Total: {usage.total / (1024**3):.2f} GB")
print(f"Used: {usage.used / (1024**3):.2f} GB")
print(f"Free: {usage.free / (1024**3):.2f} GB")

Do not

Avoid os.system() or subprocess.run() for file operations in most cases
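
As a rough sketch of what that looks like in practice, the snippet below replaces two typical shell calls with their library equivalents. The directory layout is invented for the example, and the shell commands in the comments are only there for comparison.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    (work / "logs" / "old").mkdir(parents=True)
    (work / "logs" / "old" / "app.log").write_text("line\n")

    # Instead of: os.system(f"cp -r {work}/logs {work}/logs_backup")
    shutil.copytree(work / "logs", work / "logs_backup")

    # Instead of: os.system(f"rm -rf {work}/logs")
    shutil.rmtree(work / "logs")

    backup_exists = (work / "logs_backup" / "old" / "app.log").is_file()
    original_exists = (work / "logs").exists()

print(backup_exists, original_exists)
```

The library calls raise ordinary Python exceptions on failure, need no shell quoting for paths with spaces, and behave the same on systems that lack `cp` or `rm`.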

Python Async Programming

March 11, 2024 · about 2 min read

Mikyan

White Shiba Inu

Python's asynchronous programming is built around the asyncio module, and async/await keywords.

Concept

A coroutine is a special type of function that represents a computation that can be paused and resumed.

A coroutine is defined with async def.

For example, the following function is a coroutine:

import asyncio

async def my_coroutine():
    print("Coroutine started")
    await asyncio.sleep(1) # This is a pause point
    print("Coroutine resumed after 1 second")
    return "Done!"

Inside an async def function, the await keyword pauses the execution of the current coroutine.
When a coroutine awaits something, it signals to the event loop that it is waiting for an I/O operation or some other asynchronous event to complete.
While the current coroutine is paused, the event loop can switch to other coroutines or tasks that are ready to run, so the thread is not left idle while waiting.
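
A small sketch of that switching behaviour: two coroutines each wait 0.2 seconds, but because the event loop runs one while the other is paused, the total is close to 0.2 seconds rather than 0.4. The `fetch` function and its arguments are hypothetical, with `asyncio.sleep` standing in for real I/O.

```python
import asyncio
import time

async def fetch(name: str, delay: float) -> str:
    # asyncio.sleep stands in for a real I/O wait (network call, DB query, ...)
    await asyncio.sleep(delay)
    return f"{name} done"

async def main() -> list[str]:
    # gather schedules both coroutines; while one is paused the other runs
    return await asyncio.gather(fetch("a", 0.2), fetch("b", 0.2))

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start

print(results, f"{elapsed:.2f}s")
```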

Why async def functions can be paused

A regular def function is executed directly by the Python interpreter: when you call it, the interpreter moves through its instructions sequentially. If it hits something that blocks, the entire thread stops until that blocking operation is done.
An async def function, when called, does not immediately execute its body. Instead it returns a coroutine object, which works much like a generator and which the asyncio event loop knows how to manage.
The await keyword signals an intentional pause point.
If there is no await inside an async def function, then once it is started it runs to completion like a regular synchronous function, without yielding control to the event loop.
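
The points above can be checked directly. In this minimal sketch (the `add` function is a made-up example), calling the function only creates a coroutine object; the body runs when the event loop drives it.

```python
import asyncio
import inspect

async def add(a: int, b: int) -> int:
    # No await inside: once started, the body runs straight through.
    return a + b

coro = add(1, 2)  # nothing has run yet, this is only a coroutine object
is_coroutine = inspect.iscoroutine(coro)

result = asyncio.run(coro)  # the event loop drives it to completion

print(is_coroutine, result)
```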

The event loop is the orchestrator.

The asyncio event loop continuously monitors a set of registered coroutines/tasks, acting like a dispatcher.

State preservation (generators)

Conceptually, Python coroutines are built on top of generators. When a generator yields a value, its local state (variables, instruction pointer) is saved. When next() is called on it again, it resumes from where it left off.

Similarly, when an async def function awaits, its internal state is saved. When the awaited operation completes, the coroutine is "sent" a signal to resume, and it continues execution from the line immediately following the await.
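
The generator half of that analogy can be shown in a few lines. This is an illustrative sketch with a made-up `counter` generator: the local variable `total` survives each pause, and `send()` resumes execution right after the `yield`.

```python
def counter():
    total = 0
    while True:
        # Execution pauses here; `total` is kept until the generator is resumed.
        received = yield total
        if received is not None:
            total += received

gen = counter()
first = next(gen)      # runs up to the first yield
second = gen.send(5)   # resumes after the yield with received = 5
third = gen.send(10)   # local state was preserved between calls

print(first, second, third)  # 0 5 15
```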

Why async is important for web frameworks

Python Pydantic

March 10, 2024 · about 3 min read

Mikyan

White Shiba Inu

Pydantic can extend standard Python classes to provide robust data handling features. BaseModel is the fundamental class in Pydantic. By inheriting from BaseModel, Python classes become Pydantic models, gaining capabilities for:

Data Validation: Automatically checks the types and values of class attributes against your defined type hints. It raises a ValidationError with clear, informative messages if incoming data doesn't conform.
Data Coercion: Pydantic can intelligently convert input data to the expected type where appropriate.
Instantiation: Creates instances of your model when you pass keyword arguments or an unpacked dictionary to the constructor.

Details

Inheriting from BaseModel, Python classes become Pydantic models

You can use the Field function or annotations to add more specific constraints and metadata to your fields.

from typing import Optional
from pydantic import BaseModel, Field, EmailStr, ValidationError

class User(BaseModel):
    name: str
    age: int
    email: str

# Valid data
user = User(name="Alice", age=30, email="alice@example.com")
print(user)

# Invalid data raises a ValidationError ("twenty" cannot be parsed as an int;
# email is a plain str here, so "bob@invalid" is accepted)
try:
    User(name="Bob", age="twenty", email="bob@invalid")
except ValidationError as e:
    print(e)


class Product(BaseModel):
    id: int = Field(..., gt=0, description="Unique product identifier")
    name: str = Field(..., min_length=2, max_length=100)
    price: float = Field(..., gt=0.0)
    description: Optional[str] = None # Optional field
    seller_email: EmailStr # Pydantic's email validation (requires the optional email-validator package)

product = Product(id=1, name="Laptop", price=1200.50, seller_email="seller@store.com")
print(product)

Create Pydantic Model

Call the constructor directly with an unpacked dictionary, or use model_validate to validate a dict and convert it to a model.

model_validate_json validates a JSON string and converts it to a model.

user_data = {
    "name": "Alice",
    "age": 30,
    "email": "alice@example.com"
}
user_model = User(**user_data)

user_model = User.model_validate(user_data)


class Movie(BaseModel):
    title: str
    year: int
    director: str
    genres: list[str]

# Your JSON string data
json_string = '''
{
    "title": "Inception",
    "year": 2010,
    "director": "Christopher Nolan",
    "genres": ["Sci-Fi", "Action", "Thriller"]
}
'''
movie_model = Movie.model_validate_json(json_string)

Validate dictionary and JSON string: model_validate(), model_validate_json()

model_validate: validate a Python dictionary
model_validate_json: validate a JSON string

from pydantic import BaseModel

class Item(BaseModel):
    name: str
    quantity: int

data_dict = {"name": "Apple", "quantity": 5}
item1 = Item.model_validate(data_dict)
print(item1)

json_data = '{"name": "Banana", "quantity": 10}'
item2 = Item.model_validate_json(json_data)
print(item2)

Serialization: model_dump(), model_dump_json().

model_dump: to a Python dictionary
model_dump_json: to a JSON string

from pydantic import BaseModel

class City(BaseModel):
    name: str
    population: int

tokyo = City(name="Tokyo", population=14000000)
print(tokyo.model_dump())
print(tokyo.model_dump_json(indent=2)) # Pretty print JSON

Custom Validators: @field_validator, @model_validator

from datetime import date
from pydantic import BaseModel, ValidationError, field_validator, model_validator

class Event(BaseModel):
    name: str
    start_date: date
    end_date: date

    @field_validator('name')
    @classmethod
    def check_name_is_not_empty(cls, v):
        if not v.strip():
            raise ValueError('Event name cannot be empty')
        return v

    @model_validator(mode='after') # 'after' means after field validation
    def check_dates_order(self):
        if self.start_date > self.end_date:
            raise ValueError('Start date must not be after end date')
        return self

try:
    event1 = Event(name="Conference", start_date="2025-07-20", end_date="2025-07-22")
    print(event1)
except ValidationError as e:
    print(e)

try:
    Event(name="Bad Event", start_date="2025-07-25", end_date="2025-07-23")
except ValidationError as e:
    print(e)

Nested Models

from pydantic import BaseModel
from typing import List

class Address(BaseModel):
    street: str
    city: str
    zip_code: str

class Customer(BaseModel):
    customer_id: int
    name: str
    shipping_addresses: List[Address]

customer_data = {
    "customer_id": 123,
    "name": "Jane Doe",
    "shipping_addresses": [
        {"street": "123 Main St", "city": "Anytown", "zip_code": "12345"},
        {"street": "456 Oak Ave", "city": "Otherville", "zip_code": "67890"}
    ]
}

customer = Customer.model_validate(customer_data)
print(customer)

JSON Schema Generation

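import json
from pydantic import BaseModel

class Task(BaseModel):
    id: int
    title: str
    completed: bool = False

# model_json_schema() returns a dict and has no indent parameter; use json.dumps to pretty print
print(json.dumps(Task.model_json_schema(), indent=2))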


Python Type Hint

March 10, 2024 · about 2 min read

Mikyan

White Shiba Inu

Type hints were introduced in Python 3.5 and have become more and more powerful since.

With them you can declare the types of your variables, which improves readability.

Type hints are hints, not enforcements. Python still runs the code even if types don't match.
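
A quick way to see this is to call an annotated function with the wrong type. The `double` function below is a made-up example; the call runs without error even though the argument is not an int.

```python
def double(x: int) -> int:
    return x * 2

# The annotation says int, but Python does not check it at runtime.
result = double("ab")

# The hints are still stored on the function for tools to inspect.
hints = double.__annotations__

print(result)
print(hints)
```

A static type checker such as mypy would flag the `double("ab")` call before the code ever runs, which is where type hints earn their keep.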

Usage

# Primitives
name: str = "Tom"
age: int = 30
salary: float = 500.5
is_active: bool = True

# Collections
numbers: list = [1,2,3]
scores: tuple = (90, 85, 88)
unique: set = {1, 2, 3}
data: dict = {"key": "value"}


# Specific Collection Types

from typing import List, Dict, Tuple, Set

names: List[str] = ["Alice", "Bob", "Charlie"]
user: Dict[str, str] = {
    "name": "John",
    "email": "john@example.com"
}
person: Tuple[str, int, bool] = ("Alice", 30, True)
unique_ids: Set[int] = {1, 2, 3, 4, 5}

# From Python 3.9 on, the built-in collection types can be used directly
names: list[str] = ["Alice", "Bob", "Charlie"]
user: dict[str, str] = {
    "name": "John",
    "email": "john@example.com"
}
person: tuple[str, int, bool] = ("Alice", 30, True)
unique_ids: set[int] = {1, 2, 3, 4, 5}

# Optional

from typing import Optional

# can be string or None
middle_name: Optional[str] = None

# Union 
from typing import Union
number: Union[int, float] = 10
number = 10.5


# Literal for exact values
from typing import Literal
Status = Literal["pending", "approved", "rejected"]

def process_order(status: Status) -> None:
    pass

# TypedDict
from typing import TypedDict
# TypedDict for dictionary structures
class UserDict(TypedDict):
    name: str
    age: int
    email: str


# Class
user: User = get_user(123)

# method
def calculate_bmi(weight: float, height: float) -> float:
    return weight / (height ** 2)

# Self (Python 3.11+)
from typing import Self

class User:
    def copy(self) -> Self:  # Returns the same class type, including subclasses
        return type(self)()
