This repository provides a modular pipeline to stitch and geometrically correct a mosaic of microscope tiles using stage coordinates, neighbor detection, SIFT-based matching on overlap crops, pixel→micron conversion, and a weighted least-squares optimization to refine tile positions.
- Load tile positions from a CSV into a clean DataFrame with standardized columns `tile_name, x, y`.
- Infer neighbors (8-connected) from stage coordinates with DBSCAN gating and KD-Tree range queries to avoid cross-cluster false links; estimate nominal step sizes.
- Load tile images (grayscale or color) keyed by `tile_name` and paired with their stage coordinates for downstream cropping.
- Crop overlap regions per direction (left/right/up/down + 4 diagonals) to focus matching on informative borders.
- Match overlap crops with SIFT, compute per-pair affine transforms and inlier counts via RANSAC, and collect displacements and quality stats.
- Convert pixel displacements to microns, estimating a global microns-per-pixel from stage deltas and affine translations; filter out invalid results.
- Solve a weighted position refinement, building a linear system from pairwise rules (dx, dy) + Tikhonov regularization + an optional anchor tile; multiple weighting schemes are supported.
- Package imports are exposed in the package’s `__init__.py` to simplify user code.
- Tiles are identified by `tile_name` and have stage coordinates `(x, y)` in microns (or stage units). The CSV loader normalizes to integer `x, y` columns and trims names.
- Neighbor graph is 8-connected (cardinals + diagonals) when tiles fall within tolerances of the median grid steps (`step_x, step_y`). Outliers/noise can be clustered away by DBSCAN.
- Images are read from a directory and stored as a dict: `tile_name -> ((x, y), image)`, enabling crop logic to use both image content and known position.
- Overlap crops are derived from per-direction slices controlled by `overlap_fraction` and `extra_factor`.
- Matches store SIFT keypoints, “good” matches (Lowe’s ratio), estimated affine, displacement `(dx, dy)`, rotation, and inlier counts.
- Unit conversion computes a single global microns-per-pixel using stage deltas between a matched pair and the affine translation magnitude, then applies it to all `(dx, dy)`.
- Optimization treats each pairwise relation as a “rule” with weight `w`, and solves a regularized least-squares system for refined `(x, y)` of all tiles, optionally anchoring one tile.
- Purpose: Read microscope tile stage coordinates and standardize field names.
- Key behavior: Maps CSV columns `Tile, X, Y` → `tile_name, x, y`; trims names; casts coordinates to `int`.
- Output: `pd.DataFrame[['tile_name','x','y']]`.
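
The loading behavior described above can be sketched with pandas. This is a minimal illustration, not the repository's code: the function name `load_tile_coordinates` and the sample CSV content are assumptions made for the example; only the column mapping, name trimming, and integer cast come from the description.

```python
import io

import pandas as pd


def load_tile_coordinates(csv_source):
    """Read a tile CSV and return a DataFrame with tile_name, x, y.

    Hypothetical helper: the name and signature are illustrative only.
    """
    df = pd.read_csv(csv_source)
    # Standardize the field names: Tile, X, Y -> tile_name, x, y.
    df = df.rename(columns={"Tile": "tile_name", "X": "x", "Y": "y"})
    # Trim stray whitespace so names can later match image filenames.
    df["tile_name"] = df["tile_name"].astype(str).str.strip()
    # Cast stage coordinates to integers.
    df["x"] = df["x"].astype(int)
    df["y"] = df["y"].astype(int)
    return df[["tile_name", "x", "y"]]


# Made-up sample data with padded whitespace around the first name.
sample = io.StringIO("Tile,X,Y\n tile_0.tif ,0,0\ntile_1.tif,1000,0\n")
coords = load_tile_coordinates(sample)
```

Passing a file path instead of the `StringIO` object works the same way, since `pd.read_csv` accepts both.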
- Purpose: Build an 8-connected neighbor map using geometry rather than filenames.
- How it works:
  - Estimates median spacing along each axis (`step_x`, `step_y`).
  - Uses DBSCAN to isolate clusters and prevent linking distant tiles.
  - Within each cluster, KD-Tree radius queries find nearby tiles; directional assignment uses sign and tolerance checks on `(dx, dy)` against the step sizes.
  - Optionally draws a vector visualization of neighbor directions.
- Returns: `(neighbor_map, step_x, step_y)`.
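
The directional assignment step can be illustrated with a simplified, pure-Python sketch. It is not the repository's implementation: it replaces the KD-Tree radius queries with a brute-force loop, omits the DBSCAN clustering entirely, and assumes the coordinates already sit on a near-regular grid so that the median gap between unique coordinates is a usable step estimate. The names `estimate_step`, `axis_offset`, and `assign_neighbors`, and the convention that increasing `y` means "down", are assumptions for the example.

```python
from statistics import median

# Map (x offset, y offset) in grid steps to a direction label.
# Assumption: y grows downward, as in image coordinates.
DIRECTIONS = {
    (-1, 0): "left", (1, 0): "right", (0, -1): "up", (0, 1): "down",
    (-1, -1): "top_left", (1, -1): "top_right",
    (-1, 1): "bottom_left", (1, 1): "bottom_right",
}


def estimate_step(values):
    """Median gap between sorted unique coordinates along one axis."""
    uniq = sorted(set(values))
    gaps = [b - a for a, b in zip(uniq, uniq[1:])]
    return median(gaps) if gaps else 0


def axis_offset(delta, step, tol):
    """Return -1, 0 or +1 if delta is close to -step, 0 or +step, else None."""
    if step == 0:
        return 0 if delta == 0 else None
    for k in (-1, 0, 1):
        if abs(delta - k * step) <= tol * step:
            return k
    return None


def assign_neighbors(tiles, tolerance=0.2):
    """tiles: dict of tile_name -> (x, y). Brute-force 8-connected lookup."""
    step_x = estimate_step([x for x, _ in tiles.values()])
    step_y = estimate_step([y for _, y in tiles.values()])
    neighbor_map = {name: {} for name in tiles}
    for a, (ax, ay) in tiles.items():
        for b, (bx, by) in tiles.items():
            if a == b:
                continue
            kx = axis_offset(bx - ax, step_x, tolerance)
            ky = axis_offset(by - ay, step_y, tolerance)
            if kx is None or ky is None or (kx, ky) == (0, 0):
                continue
            neighbor_map[a][DIRECTIONS[(kx, ky)]] = b
    return neighbor_map, step_x, step_y


grid = {"a": (0, 0), "b": (100, 0), "c": (0, 100), "d": (100, 100)}
neighbor_map, step_x, step_y = assign_neighbors(grid)
```

The brute-force loop is quadratic in the number of tiles, which is the cost the KD-Tree queries avoid on large mosaics.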
- Purpose: Load grayscale or color images for a set of tiles listed in the DataFrame.
- I/O shape: Returns a dict `tile_name -> ((x, y), image)` for either grayscale (`IMREAD_GRAYSCALE`) or color (`IMREAD_COLOR`), emitting warnings for failures.
- Notes: Relies on exact `tile_name` → filename matches; trims `tile_name` to avoid whitespace mismatches.
- Purpose: Extract only the likely overlapping borders to make SIFT faster and cleaner.
- Mechanics:
  - Cleans both `tile_data` and `neighbors` keys/values (trimming names).
  - Computes crop widths/heights as `overlap_fraction + extra_factor` of image dimensions.
  - Produces per-tile dict of patches keyed by direction.
- Directions covered: left, right, up, down, top_left, top_right, bottom_left, bottom_right.
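
A minimal sketch of the per-direction slicing, assuming images are NumPy arrays in `(height, width[, channels])` layout. The function name `crop_overlaps` and the rounding rule are assumptions; only the `overlap_fraction + extra_factor` sizing and the eight direction keys come from the description above.

```python
import numpy as np


def crop_overlaps(image, overlap_fraction=0.15, extra_factor=0.10):
    """Return a dict of border patches keyed by direction.

    Hypothetical helper; crop size is (overlap_fraction + extra_factor)
    of each image dimension, rounded to at least one pixel.
    """
    h, w = image.shape[:2]
    frac = overlap_fraction + extra_factor
    ch = max(1, int(round(h * frac)))  # crop height for up/down strips
    cw = max(1, int(round(w * frac)))  # crop width for left/right strips
    return {
        "left": image[:, :cw],
        "right": image[:, w - cw:],
        "up": image[:ch, :],
        "down": image[h - ch:, :],
        "top_left": image[:ch, :cw],
        "top_right": image[:ch, w - cw:],
        "bottom_left": image[h - ch:, :cw],
        "bottom_right": image[h - ch:, w - cw:],
    }


tile = np.zeros((100, 200), dtype=np.uint8)
patches = crop_overlaps(tile)
```

The slices are views into the original array, so cropping does not copy pixel data.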
- Purpose: For each tile/direction, match its cropped patch to the opposite-side crop of its neighbor and estimate an affine transform.
- Pipeline specifics:
  - SIFT feature detection + descriptors; BFMatcher with KNN and Lowe’s ratio test (`ratio_thresh`).
  - Opposite-direction mapping ensures correct patch pairing (e.g., tile A’s “right” vs tile B’s “left”).
  - Computes `AffinePartial2D` with RANSAC; records translation `(dx, dy)`, rotation in degrees, and `num_inliers`.
  - Skips pairs with too few matches (`min_matches`).
- Outputs: A dict keyed by `(tile_i, tile_j)` with per-pair metadata and quality stats.
- Purpose: Convert per-pair pixel displacements to microns using an estimated global scale; drop invalid pairs.
- Method:
  - Finds the first pair with a valid affine matrix and known stage positions; computes microns-per-pixel from the ratio of stage delta magnitude to affine translation magnitude.
  - If no valid pair exists, defaults to `1.0` and warns.
  - Applies conversion to all pairs and returns a filtered dict with `dx_microns`, `dy_microns`, and the chosen scale.
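
The method above can be sketched as follows. This is an illustration under stated assumptions, not the module's code: the function names, the per-pair keys `affine_matrix`, `dx`, and `dy`, and the choice to skip pairs with zero pixel translation are all hypothetical.

```python
import math
import warnings


def estimate_microns_per_pixel(matches, stage_positions):
    """Scale from the first usable pair, or None if no pair qualifies."""
    for (tile_i, tile_j), m in matches.items():
        if m.get("affine_matrix") is None:
            continue
        if tile_i not in stage_positions or tile_j not in stage_positions:
            continue
        pixel_shift = math.hypot(m["dx"], m["dy"])
        if pixel_shift == 0:
            continue  # a zero translation cannot define a ratio
        (xi, yi), (xj, yj) = stage_positions[tile_i], stage_positions[tile_j]
        return math.hypot(xj - xi, yj - yi) / pixel_shift
    return None


def convert_matches(matches, stage_positions):
    """Add dx_microns / dy_microns and drop pairs without an affine."""
    scale = estimate_microns_per_pixel(matches, stage_positions)
    if scale is None:
        warnings.warn("No valid pair for scale estimation; using 1.0")
        scale = 1.0
    converted = {}
    for pair, m in matches.items():
        if m.get("affine_matrix") is None:
            continue
        converted[pair] = {
            **m,
            "dx_microns": m["dx"] * scale,
            "dy_microns": m["dy"] * scale,
        }
    return converted, scale


matches = {
    ("a", "b"): {"affine_matrix": [[1, 0, 300], [0, 1, 400]], "dx": 300, "dy": 400},
    ("b", "c"): {"affine_matrix": None, "dx": 0, "dy": 0},
}
stage = {"a": (0, 0), "b": (1000, 0)}
filtered_matches, microns_per_pixel = convert_matches(matches, stage)
```

Because the scale comes from a single pair, one poor match can skew every converted displacement, which is why the quality of that first valid pair matters.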
- Purpose: Integrate all pairwise displacements into a single, globally consistent correction of tile coordinates.
- Steps:
  - Weights: Multiple schemes (`raw_inliers`, `sqrt_inliers`, `inlier_ratio`, `hybrid`, `log_capped`, `uniform`) computed from match quality; optional normalization.
  - Rules: For each matched pair `(i, j)`, build constraints enforcing `x_j - x_i ≈ dx_microns` and `y_j - y_i ≈ dy_microns`, scaled by weight.
  - Regularization: Penalize deviation from noisy stage points with λ (Tikhonov), stabilizing under sparse/uneven coverage.
  - Anchoring: Optionally fix the first point to remove global translation ambiguity.
  - Solve: Linear least-squares; optionally visualize original vs corrected positions and residual vectors.
- Returns: `corrected_points` and residuals. A convenience `run(...)` composes the full workflow.
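
The steps above can be sketched as a small NumPy least-squares problem. This is one possible formulation, not necessarily the one the module uses: the function name `refine_positions`, the tuple layout of `rules`, the use of `sqrt(lambda_reg)` as the row scale for the Tikhonov terms, and the implementation of anchoring as a heavily weighted soft constraint (`anchor_weight`) are all assumptions.

```python
import numpy as np


def refine_positions(names, stage_xy, rules, lambda_reg=0.1,
                     fix_tile=True, anchor_weight=1e6):
    """Solve for refined (x, y) of every tile.

    rules: iterable of (tile_i, tile_j, dx_microns, dy_microns, weight).
    Returns (corrected_points, residuals); x and y share one design matrix.
    """
    index = {name: k for k, name in enumerate(names)}
    n = len(names)
    stage = np.asarray(stage_xy, dtype=float)  # shape (n, 2)
    rows, rhs = [], []

    # Pairwise rules: w * (p_j - p_i) ≈ w * (dx, dy).
    for tile_i, tile_j, dx, dy, w in rules:
        row = np.zeros(n)
        row[index[tile_j]] = w
        row[index[tile_i]] = -w
        rows.append(row)
        rhs.append([w * dx, w * dy])

    # Tikhonov terms: pull every tile toward its stage coordinate.
    s = np.sqrt(lambda_reg)
    for k in range(n):
        row = np.zeros(n)
        row[k] = s
        rows.append(row)
        rhs.append(s * stage[k])

    # Anchor (assumed soft): pin the first tile with a very large weight.
    if fix_tile:
        row = np.zeros(n)
        row[0] = anchor_weight
        rows.append(row)
        rhs.append(anchor_weight * stage[0])

    A = np.vstack(rows)
    b = np.asarray(rhs, dtype=float)
    corrected, *_ = np.linalg.lstsq(A, b, rcond=None)
    residuals = A @ corrected - b
    return corrected, residuals


names = ["a", "b"]
stage_xy = [(0.0, 0.0), (100.0, 0.0)]
rules = [("a", "b", 110.0, 0.0, 1.0)]
loose, _ = refine_positions(names, stage_xy, rules, lambda_reg=1e-6)
tight, _ = refine_positions(names, stage_xy, rules, lambda_reg=1e6)
```

With a tiny λ the measured offset wins and tile `b` moves to roughly x = 110; with a very large λ it stays near its stage position at x = 100.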
- Purpose: Re-exports core entry points for convenient imports from the package root (coordinates loading, neighbor detection, image loading, cropping, conversion).
- File naming: `tile_name` must match the on-disk image filename; the loaders do a `.strip()` on names, but not more. If images fail to load, verify exact names and extensions.
- Grid irregularities: If stage coordinates are noisy or sparse, DBSCAN parameters (`eps`, `min_samples`) and neighbor tolerance may need tuning to avoid cross-row/column links.
- Crop sizing: Increase `overlap_fraction` or `extra_factor` if matches are weak or borders are too thin; reduce them for speed once stability is confirmed.
- SIFT thresholds: If you get many “No descriptors” or too few matches, consider adjusting `ratio_thresh` or ensuring sufficient texture in the overlap zones.
- Scale estimation: The microns-per-pixel is derived from the first valid pair; ensure at least one reliable match exists. If not, the module falls back to 1.0 (your optimization will still run, but in pixel units).
- Weighting choice:
  - Use `hybrid` (default) to emphasize both inlier count and match purity.
  - Switch to `uniform` for ablation or to debug the effect of matcher noise.
  - `log_capped` can keep a few very strong links from dominating the solution.
- Regularization λ: Larger λ keeps refined points closer to stage coordinates (useful when matches are sparse or uneven); smaller λ lets matches drive bigger corrections.
- Anchoring: Keep `fix_tile=True` unless you explicitly post-shift the whole solution; otherwise the system is underdetermined for translation.
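
To make the weighting tips more concrete, here is a sketch of how the named schemes could behave. The scheme names come from this README, but every formula below is an assumption chosen to match the qualitative descriptions (for example, `hybrid` as inlier count combined with inlier ratio, and `log_capped` as a logarithm of a capped count); the actual definitions in the optimizer may differ. The function names and the `cap` parameter are hypothetical.

```python
import math


def match_weight(num_inliers, num_matches, method="hybrid", cap=50):
    """Illustrative weight for one matched pair (formulas are assumed)."""
    ratio = num_inliers / num_matches if num_matches else 0.0
    if method == "raw_inliers":
        return float(num_inliers)
    if method == "sqrt_inliers":
        return math.sqrt(num_inliers)
    if method == "inlier_ratio":
        return ratio
    if method == "hybrid":
        # Assumed form: count (damped) times purity.
        return math.sqrt(num_inliers) * ratio
    if method == "log_capped":
        # Assumed form: very strong links saturate at the cap.
        return math.log1p(min(num_inliers, cap))
    if method == "uniform":
        return 1.0
    raise ValueError(f"Unknown weight method: {method}")


def normalize_weights(weights):
    """Scale weights so the largest equals 1.0 (one possible normalization)."""
    top = max(weights, default=0.0)
    return [w / top for w in weights] if top > 0 else list(weights)


hybrid_w = match_weight(16, 32, method="hybrid")
capped_w = match_weight(1000, 1200, method="log_capped")
normalized = normalize_weights([2.0, 4.0])
```

Under these assumed formulas, a pair with 1000 inliers receives the same `log_capped` weight as one with 50, which is the tempering effect described in the tips.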
- Load CSV → `df` with `tile_name, x, y`.
- Infer `(neighbor_map, step_x, step_y)` from `df`.
- Load grayscale images → `tile_data`.
- Crop overlaps → `cropped_patches`.
- Match neighbors → `match_results` with affine and inlier stats.
- Convert to microns + filter → `filtered_matches` and `microns_per_pixel`.
- Optimize → refined coordinates `corrected_points` (optionally plot).
- (Optional) Import helpers directly from package root per `__init__.py`.
- Neighbor detection: `tolerance` (fraction of step), `dbscan_eps`, `dbscan_min_samples`, `visualize`.
- Crops: `overlap_fraction` (e.g., 0.15), `extra_factor` (e.g., 0.10).
- SIFT matching: `ratio_thresh` (e.g., 0.75), `min_matches` (e.g., 3).
- Conversion: `anchor` (reserved), returns `microns_per_pixel`.
- Optimization: `weight_method`, `normalize`, `lambda_reg`, `fix_tile`, `visualize`.
- Neighbor graph: dict of `tile_name -> directions -> neighbor_name`.
- Overlap crops: per-tile dict of border images by direction.
- Match summary: per-pair metadata including `num_matches`, `num_inliers`, `affine_matrix`, `(dx, dy)`, `rotation_deg`.
- Converted matches: `dx_microns`, `dy_microns`, global `microns_per_pixel`.
- Refined coordinates: `corrected_points` aligned in a consistent global frame, plus residuals.
- “Missing patch” or `None` neighbor: Check neighbor detection tolerances and clustering; ensure tiles actually overlap at expected step sizes.
- Few or zero SIFT matches: Increase crop size, verify image contrast on borders, relax `ratio_thresh`, or switch to grayscale inputs.
- Scale seems off: Ensure at least one high-quality pair exists for pixel→micron estimation; otherwise you’ll see a warning and a fallback to 1.0.
- Warped global solution: Try `log_capped` or `uniform` weights, increase `lambda_reg`, or check for mislabeled neighbors that inject contradictory rules.
- Languages: Python
- Libraries: OpenCV (SIFT, RANSAC), scikit-learn (DBSCAN, KD-Tree), NumPy, Pandas, Optuna (HPO), Matplotlib
- Optimization: Regularized least-squares, multi-objective NSGA-II
- Workflow: Modular design with dataset registry, CV folds, and HPO orchestration