Conversation

@Umang01-hash
Member

Pull Request Template

Description:

  • Added a GCS file provider in pkg/datasource/file/gcs.
  • Implemented Create, Remove, ReadDir, Open, Stat, and MakeDir using cloud.google.com/go/storage.
  • Simulates directories using object key prefixes.
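
The prefix-based directory simulation mentioned in the last bullet can be sketched roughly as follows. This is an illustration only, not the code from this PR: `listDir` and the sample keys are hypothetical, and the sketch assumes a directory is represented by the keys that share its prefix (optionally with a marker object whose key ends in `/`).

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// listDir splits the flat object keys under prefix into direct files and
// immediate "subdirectories" (the next path segment, ending in "/").
// Hypothetical helper for illustration; not taken from the PR.
func listDir(keys []string, prefix string) (files, dirs []string) {
	seen := map[string]bool{}
	for _, key := range keys {
		if !strings.HasPrefix(key, prefix) {
			continue
		}
		rest := strings.TrimPrefix(key, prefix)
		if rest == "" {
			continue // the directory marker object itself
		}
		if i := strings.Index(rest, "/"); i >= 0 {
			dir := rest[:i+1]
			if !seen[dir] {
				seen[dir] = true
				dirs = append(dirs, dir)
			}
			continue
		}
		files = append(files, rest)
	}
	sort.Strings(files)
	sort.Strings(dirs)
	return files, dirs
}

func main() {
	keys := []string{"logs/", "logs/app.log", "logs/2025/jan.log", "logs/2025/feb.log", "readme.txt"}
	files, dirs := listDir(keys, "logs/")
	fmt.Println(files, dirs) // prints "[app.log] [2025/]"
}
```

A real backend would normally ask the storage service to do this grouping (for example with a delimiter query) rather than filtering keys client-side.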

Checklist:

  • I have formatted my code using goimports and golangci-lint.
  • All new code is covered by unit tests.
  • This PR does not decrease the overall code coverage.
  • I have reviewed the code comments and documentation for clarity.

Fixes #2013

NOTE: This PR is a continuation of PR #2117. Most of the work here belongs to Suryakantdsa. I only extended it to include common logger and metrics for File Systems.

Suryakantdsa and others added 30 commits October 3, 2025 23:53
@akshat-kumar-singhal
Contributor

@Umang01-hash Consider defining a smaller interface and embedding it in the implementation of the FileSystem interface.
Example:

type StorageProvider interface {
  // Method names are exported so that backends in other packages can implement them.
  Fetch(path string)
  Save(data []byte, path string, config any)
}

Each backend (GCS and others) can then implement this interface.

@Umang01-hash
Member Author

Refactoring: Unified Storage Architecture

We've completed a comprehensive refactoring that introduces a unified architecture for all file storage providers (GCS, S3, FTP, SFTP).


📋 Key Changes

1️⃣ Introduced StorageProvider Interface

A stateless interface that all storage backends must implement:

type StorageProvider interface {
    NewReader(ctx context.Context, name string) (io.ReadCloser, error)
    NewRangeReader(ctx context.Context, name string, offset, length int64) (io.ReadCloser, error)
    NewWriter(ctx context.Context, name string) io.WriteCloser

    DeleteObject(ctx context.Context, name string) error
    CopyObject(ctx context.Context, src, dst string) error
    StatObject(ctx context.Context, name string) (*ObjectInfo, error)

    ListObjects(ctx context.Context, prefix string) ([]string, error)
    ListDir(ctx context.Context, prefix string) ([]ObjectInfo, []string, error)
}

2️⃣ Created Common Abstractions

  • common_fs.go - Directory operations (Mkdir, MkdirAll, ReadDir, Stat, RemoveAll, ChDir, Getwd)
  • common_file.go - File operations (Read, Write, Seek, Close, ReadAt, WriteAt)
  • row_reader.go - JSON/CSV file readers

All operations delegate to StorageProvider, eliminating duplicate code across backends.

3️⃣ Refactored GCS Implementation

GCS now uses a two-layer architecture:

  • storage_adapter.go - Implements StorageProvider (stateless GCS SDK wrapper)
  • fs.go - Connection management, authentication, retry logic + embeds CommonFileSystem
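
The two layers could be wired together with Go struct embedding along these lines. This is a minimal sketch under stated assumptions: `gcsFileSystem`, its fields, and the body of `Connect` are hypothetical, and the shared layer is reduced to a single method.

```go
package main

import (
	"errors"
	"fmt"
)

// CommonFileSystem stands in for the shared layer; only Getwd is sketched.
type CommonFileSystem struct {
	cwd string
}

func (c *CommonFileSystem) Getwd() (string, error) { return c.cwd, nil }

// gcsFileSystem is a hypothetical name for the type defined in fs.go.
// Embedding *CommonFileSystem promotes its methods, so callers can use
// Getwd directly on the GCS type without any forwarding code.
type gcsFileSystem struct {
	*CommonFileSystem
	bucket    string
	connected bool
}

// Connect validates configuration and installs the shared layer.
// A real implementation would also create the SDK client here.
func (g *gcsFileSystem) Connect() error {
	if g.bucket == "" {
		return errors.New("bucket name is required")
	}
	g.CommonFileSystem = &CommonFileSystem{cwd: "/"}
	g.connected = true
	return nil
}

func main() {
	fs := &gcsFileSystem{bucket: "demo-bucket"}
	if err := fs.Connect(); err != nil {
		fmt.Println("connect failed:", err)
		return
	}
	dir, _ := fs.Getwd() // promoted from the embedded CommonFileSystem
	fmt.Println(fs.connected, dir) // prints "true /"
}
```

Note that with an embedded pointer, promoted methods must not be called before Connect has set it; a value embed or a constructor would avoid that hazard.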

🏗️ Architecture

┌─────────────────────────────────────────┐
│        USER CODE (GoFr App)             │
└────────────┬────────────────────────────┘
             │
             ▼
┌─────────────────────────────────────────┐
│    common_fs.go (Directory Operations)  │
└────────────┬────────────────────────────┘
             │
             ▼
┌─────────────────────────────────────────┐
│    common_file.go (File Operations)     │
└────────────┬────────────────────────────┘
             │
             ▼
┌─────────────────────────────────────────┐
│    StorageProvider Interface            │
└────────────┬────────────────────────────┘
             │
     ┌───────┴───────┬─────────┬──────────┐
     ▼               ▼         ▼          ▼
┌─────────┐    ┌─────────┐ ┌───────┐ ┌────────┐
│ GCS     │    │ S3      │ │ FTP   │ │ SFTP   │
│ storage_│    │ storage_│ │storage│ │storage_│
│ adapter │    │ adapter │ │adapter│ │adapter │
└─────────┘    └─────────┘ └───────┘ └────────┘

📊 What Changed in GCS

Before: Monolithic implementation with mixed concerns

After:

  • storage_adapter.go - Pure GCS SDK operations (NewReader, NewWriter, StatObject, etc.)
  • fs.go - Connection lifecycle + embeds CommonFileSystem for all file/directory operations

@akshat-kumar-singhal @Suryakantdsa @coolwednesday @aryan-mehrotra-zs Please review the updated implementation. I am adding the tests now and will address any review comments along with them.


coolwednesday previously approved these changes Nov 7, 2025
@aryanmehrotra aryanmehrotra merged commit 9e61427 into development Nov 7, 2025
19 checks passed
@aryanmehrotra aryanmehrotra deleted the fix/logger_metrics_for_file_systems branch November 7, 2025 16:27

Development

Successfully merging this pull request may close these issues.

Google Cloud Storage (GCS) integration as File System

6 participants