Skip to content

Commit 3f3a41b

Browse files
committed
Reworked parser using c++11 filesystem.
1 parent d8fa145 commit 3f3a41b

File tree

14 files changed

+706
-211
lines changed

14 files changed

+706
-211
lines changed
Lines changed: 85 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,85 @@
1+
# Faust Parser Modernization Plan
2+
3+
## 1. Current-State Analysis
4+
- **Legacy Path Handling**`parser/enrobage.cpp` manually composes file paths using `fopen`, `chdir`, and raw `getcwd` buffers, storing search roots as strings inside `global::gImportDirList`. This approach is brittle (no RAII), thread-hostile (process-wide directory changes), and ignores Unicode or platform-specific nuances.
5+
- **Duplicated Utilities** – Helpers such as `fileBasename`, `fileDirname`, and `buildFullPathname` reimplement functionality offered by `std::filesystem`, relying on ad-hoc separator macros and custom logic for Windows drive prefixes.
6+
- **FILE*-Centric IO**`SourceReader` and related components still pivot on `FILE*` and C-level globals (`FAUSTin`, `FAUSTfilename`). That complicates error handling, makes resource management manual, and prevents reuse of modern C++ features.
7+
- **Coupled Import Resolution** – URL checks, file checks, and directory stack updates are interwoven, making it hard to test or evolve the import process independently of the parser.
8+
9+
## 2. Rewrite Objectives (C++17 Filesystem)
10+
1. **Adopt `std::filesystem::path`** for all path manipulation: parent directories, basenames, stems, and canonicalisation.
11+
2. **Eliminate `chdir`/`getcwd` usage** by joining paths (`dir / filename`) and probing with `fs::exists`, `fs::canonical`, or `fs::weakly_canonical`.
12+
3. **Transition to RAII streams** (`std::ifstream`, `std::ofstream`) while keeping Flex interfaces working via adapters (e.g. a thin helper that opens a `FILE*` using the resolved `fs::path` only where the lexer requires it).
13+
4. **Centralise resolution logic** in a dedicated service (e.g. `FileResolver`) that operates on vectors of `fs::path`, keeps caches of resolved imports, and returns structured errors.
14+
5. **Refine global state** to store search directories as `fs::path` and expose helper methods (`addImportDir`, `importDirectories`, etc.) so other subsystems can interact without string munging.
15+
6. **Improve diagnostics** by reporting canonical paths and preserving original requests, especially when multiple search roots are involved.
16+
17+
## 3. Testing Strategy (Prepare Before Refactoring)
18+
- Establish this coverage up front so baseline behaviour is locked before any rewrite begins.
19+
### Unit Tests
20+
- **FileResolver coverage**: mock search roots, verify resolution precedence, canonicalisation, and cache updates. Include edge cases (relative traversals, symlinks, non-existent files).
21+
- **Path Utilities**: tests for metadata extraction (`stem`, `filename`, `parent_path`) across POSIX and Windows semantics.
22+
- **Error Handling**: ensure descriptive exceptions/messages when resolution fails or permissions are insufficient.
23+
24+
### Integration Tests
25+
- **Parser Imports**: run the parser on sample projects with nested imports, relative paths, and architecture lookups; compare generated AST/signals before and after the refactor.
26+
- **Mixed Protocols**: confirm that `file://` URLs map correctly to local paths and that HTTP imports still pass through fetcher stubs.
27+
- **Regression Suite**: reuse existing compiler tests (`make check`, existing CI scripts) to guard against behavioural deviations.
28+
29+
### Cross-Platform Validation
30+
- Validate on macOS/Linux and Windows, paying attention to drive-letter handling, UNC paths, and symlink resolution.
31+
- Run tests under both UTF-8 and locale-sensitive directories (paths containing non-ASCII characters) to ensure Unicode-safe behaviour.
32+
33+
### Performance Checks
34+
- Benchmark import-heavy workloads to ensure the resolver’s canonicalisation and caching do not introduce regressions.
35+
- Profile optional features (e.g. repeated inclusion detection) to confirm they function with the new path abstraction.
36+
37+
## 4. Proposed Refactoring Steps
38+
1. **Introduce `FileResolver` Module**
39+
- Encapsulate search roots, canonicalisation, and relative path updates.
40+
- Provide APIs: `resolve_import`, `resolve_architecture`, `add_search_root`, `resolved_history`.
41+
2. **Migrate Utility Functions**
42+
- Replace `fileBasename`, `fileDirname`, `buildFullPathname`, and `isAbsolutePathname` with `std::filesystem` equivalents.
43+
- Update metadata extraction in `SourceReader` to derive `stem`, `filename`, and `parent_path` through the new API.
44+
3. **Rewrite `fopenSearch` and Friends**
45+
- Reimplement import search using `FileResolver` without `chdir`.
46+
- Ensure the directory auto-discovery (adding containing folder after a successful import) uses `fs::path::parent_path()`.
47+
4. **Transition `SourceReader` IO**
48+
- Wrap `std::ifstream` in a RAII struct that exposes `FILE*` to Flex or integrate Flex's alternative `yyset_in`.
49+
- Convert `gGlobal->gInputFiles` and related lists to store `fs::path` (or strings derived from them on demand).
50+
5. **Decouple Network Fetching**
51+
- Isolate the URL-handling code path so local filesystem logic does not depend on `http_fetch`.
52+
6. **Update Global Initialisation**
53+
- Canonicalise directories during startup; ensure serialization/deserialization of paths works across platforms.
54+
7. **Retire Legacy Helpers**
55+
- Remove obsolete path utilities once all consumers use the modern code.
56+
8. **Documentation & Migration Notes**
57+
- Record API changes and migration steps for downstream tools or scripts that rely on the old behaviour.
58+
59+
## 5. Prompt Sequence for Incremental Implementation
60+
0. **Establish Testing Baseline**
61+
- Prompt: “Set up the parser testing harness: add unit-test scaffolding for `FileResolver`, create sample DSP projects for integration checks, and build a Makefile-driven test target developers can run locally before each step.”
62+
- Tests: run the new make-based unit/integration suite to capture baseline results; document any gaps or flaky cases.
63+
1. **Create FileResolver Skeleton**
64+
- Prompt: “Implement a `parser/FileResolver` class using `std::filesystem` to manage search directories and resolve import paths; include initial unit tests.”
65+
- Tests: run new unit tests for resolver logic.
66+
2. **Refactor Path Utilities**
67+
- Prompt: “Replace legacy basename/dirname helpers with `std::filesystem` operations throughout `parser/enrobage.cpp`; adjust include paths and add coverage for metadata extraction.”
68+
- Tests: run resolver tests, plus targeted parser unit tests if available.
69+
3. **Reimplement `fopenSearch` Flow**
70+
- Prompt: “Rewrite `fopenSearch` and related helpers to use `FileResolver`; ensure the global import directory list is updated via the new API.”
71+
- Tests: resolver tests + parser integration sample (compile a simple Faust file).
72+
4. **Modernise `SourceReader` IO**
73+
- Prompt: “Update `SourceReader` to open files via `std::ifstream` (or a RAII adapter) and feed the lexer accordingly; replace raw strings with `fs::path` where feasible.”
74+
- Tests: parser integration tests (import scenarios, error reporting).
75+
5. **Adapt Global State**
76+
- Prompt: “Modify `global` to store search directories as `std::filesystem::path` values and expose helper methods for mutation.”
77+
- Tests: run compiler CLI smoke tests, verify options like `-I` and `-A`.
78+
6. **Isolate Network Fetching**
79+
- Prompt: “Extract URL-handling into a dedicated layer, ensuring filesystem resolver paths remain clean; adjust tests accordingly.”
80+
- Tests: integration tests covering HTTP/file imports (using stubs).
81+
7. **Cleanup & Documentation**
82+
- Prompt: “Remove deprecated helpers, update documentation/README, and note migration requirements; ensure full test suite passes.”
83+
- Tests: entire regression suite, documentation lint if applicable.
84+
85+
Each step should produce targeted documentation updates (e.g. resolver README, migration notes) and confirm via designated tests before moving on. This keeps the rewrite incremental, testable, and reviewable.
Lines changed: 60 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,60 @@
1+
# Parser Testing Baseline Setup
2+
3+
- Introduced `parser/file_resolver.hh` with a placeholder `FileResolver` interface so the upcoming rewrite has a compilation target while real logic is developed.
4+
- Created `tests/parser/` harness:
5+
- `Makefile` providing `make run` and `make baseline` targets for local execution.
6+
- `README.md` documenting harness usage.
7+
- Fixtures under `fixtures/` (currently a simple DSP file and an architecture directory placeholder) to back future integration tests.
8+
- Added `tests/parser/test_file_resolver.cpp`, which:
9+
- Checks that import/architecture directories can be registered.
10+
- Confirms `resolveImport` currently throws until implemented, preserving visibility of unimplemented sections.
11+
- Captured baseline results by running `cd tests/parser && make clean && make run`; the suite builds the test binary and reports successful completion.
12+
13+
## FileResolver Implementation (Step 1)
14+
- Replaced the placeholder with a filesystem-aware resolver that:
15+
- Normalises search roots, removes duplicates, and ignores URL-style inputs.
16+
- Resolves relative/absolute filenames against the registered import or architecture directories, returning canonical paths when matches are found.
17+
- Expanded `tests/parser/test_file_resolver.cpp` to validate:
18+
- Directory registration and duplicate filtering.
19+
- Resolution of relative, normalised, absolute, and direct paths.
20+
- Architecture lookups using a new `fixtures/arch/minimal.arch`.
21+
- Failure cases (missing files, URL schemes).
22+
- Added the minimal architecture fixture to support the tests.
23+
- Re-ran `cd tests/parser && make clean && make run` to confirm all resolver tests pass.
24+
25+
## Path Utility Refactor (Step 2)
26+
- Modernised `parser/enrobage` path helpers to rely on `std::filesystem`:
27+
- `fileBasename`/`fileDirname` now operate on `std::string` inputs with inline implementations using `fs::path`.
28+
- Legacy manual separator handling and ad-hoc absolute-path detection were replaced with `std::filesystem` queries.
29+
- `buildFullPathname` now builds canonical absolute paths via `fs::absolute`/`fs::weakly_canonical`.
30+
- Updated call sites (Documentator, parser tests) to use the new signatures.
31+
- Extended the parser test harness with coverage for the path helpers.
32+
- Executed `cd tests/parser && make clean && make run` to verify the suite after the refactor.
33+
34+
## Resolver-Driven File Opening (Step 3)
35+
- Replaced the legacy `chdir`-based search loops in `parser/enrobage.cpp` with `FileResolver`-powered lookups for both imports and architectures.
36+
- Added canonical directory tracking helpers so directories discovered via successful resolutions are re-inserted into `gGlobal` without duplication.
37+
- Updated `openArchStream`/`fopenSearch` to open canonical paths returned by the resolver and drop the manual `buildFullPathname`/`fopenAt` chain.
38+
- Existing `global` directory lists now seamlessly feed the resolver, keeping CLI behaviour intact while benefiting from canonicalisation at open time.
39+
- Re-ran `cd tests/parser && make clean && make run` to ensure the harness passes with the new resolver-integrated flow.
40+
41+
## Directory utility helpers (Step 4)
42+
- Extracted canonical directory logic into shared helpers inside `parser/file_resolver.hh` so both the resolver and tests share a single implementation.
43+
- Updated tests to verify duplicate suppression and canonicalisation behaviour for the new helper.
44+
- Re-ran `cd tests/parser && make clean && make run` to confirm the suite still passes.
45+
46+
## Canonical global search paths (Step 5)
47+
- Normalised every directory added to `gImportDirList` and `gArchitectureDirList` (command-line, env vars, defaults) via the shared helper so the resolver receives canonical strings.
48+
- Updated master/Faust directory bookkeeping to store canonicalised paths up front.
49+
- Re-ran `cd tests/parser && make clean && make run` to ensure the harness still passes.
50+
51+
## Current status
52+
- Parser still relies on string-based globals; lifetime fixes remain pending after identifying crash regressions.
53+
- FileResolver integrations cover import/architecture lookups, but SourceReader still uses legacy FILE* management.
54+
55+
## Suggested next steps
56+
1. Build focused SourceReader tests using temporary files to reproduce pointer-lifetime bugs deterministically.
57+
2. Introduce a small parser context struct carrying canonical filename strings instead of mutating global state directly.
58+
3. Replace remaining raw `FILE*` usage with RAII wrappers once tests confirm the approach.
59+
4. Extend resolver helpers to feed other subsystems (Documentator, libcode) so canonicalisation is guaranteed end-to-end.
60+
5. Consider a high-level integration test that compiles representative DSPs to catch regressions before releases.

compiler/documentator/doc.cpp

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -362,8 +362,7 @@ static void printFaustListing(string& faustfile, ostream& docout)
362362
src.open(faustfile.c_str(), ifstream::in);
363363

364364
docout << endl << "\\bigskip\\bigskip" << endl;
365-
docout << "\\begin{lstlisting}[caption=\\texttt{" << fileBasename(faustfile.c_str()) << "}]"
366-
<< endl;
365+
docout << "\\begin{lstlisting}[caption=\\texttt{" << fileBasename(faustfile) << "}]" << endl;
367366

368367
bool isInsideDoc = false;
369368

@@ -1046,7 +1045,7 @@ static void copyFaustSources(const char* projname, const vector<string>& pathnam
10461045
ifstream src;
10471046
ofstream dst;
10481047
string faustfile = pathnames[i];
1049-
string copy = subst("$0/$1", srcdir, fileBasename(faustfile.c_str()));
1048+
string copy = subst("$0/$1", srcdir, fileBasename(faustfile));
10501049
// cerr << "Documentator : copyFaustSources : Opening input file '" << faustfile << "'" <<
10511050
// endl; cerr << "Documentator : copyFaustSources : Opening output file '" << copy << "'" <<
10521051
// endl;

compiler/global.cpp

Lines changed: 34 additions & 26 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,7 @@
3434
#include "exepath.hh"
3535
#include "exp10prim.hh"
3636
#include "expprim.hh"
37+
#include "fileresolver.hh"
3738
#include "floats.hh"
3839
#include "floorprim.hh"
3940
#include "fmodprim.hh"
@@ -117,6 +118,7 @@
117118
#endif
118119

119120
using namespace std;
121+
using faust::parser::canonicalDirectoryString;
120122

121123
#ifndef AP_INT_MAX_W
122124
#define AP_INT_MAX_W 1024
@@ -1553,7 +1555,7 @@ bool global::processCmdline(int argc, const char* argv[])
15531555
if (path) {
15541556
// We want to search user given directories *before* the standard ones, so
15551557
// insert at the beginning
1556-
gImportDirList.insert(gImportDirList.begin(), path);
1558+
gImportDirList.insert(gImportDirList.begin(), canonicalDirectoryString(path));
15571559
}
15581560
}
15591561
i += 2;
@@ -1565,7 +1567,7 @@ bool global::processCmdline(int argc, const char* argv[])
15651567
char temp[PATH_MAX + 1];
15661568
char* path = realpath(argv[i + 1], temp);
15671569
if (path) {
1568-
gArchitectureDirList.push_back(path);
1570+
gArchitectureDirList.push_back(canonicalDirectoryString(path));
15691571
}
15701572
}
15711573
i += 2;
@@ -1885,6 +1887,8 @@ void global::initDocumentNames()
18851887
gDocName = fxName(gMasterDocument);
18861888
}
18871889

1890+
gMasterDirectory = canonicalDirectoryString(gMasterDirectory);
1891+
18881892
// Add gMasterDirectory in gImportDirList and gArchitectureDirList
18891893
gImportDirList.push_back(gMasterDirectory);
18901894
gArchitectureDirList.push_back(gMasterDirectory);
@@ -1896,45 +1900,49 @@ void global::initDirectories(int argc, const char* argv[])
18961900
char s[1024];
18971901
getFaustPathname(s, 1024);
18981902

1899-
gFaustExeDir = exepath::get(argv[0]);
1900-
gFaustRootDir = exepath::dirup(gFaustExeDir);
1901-
gFaustDirectory = fileDirname(s);
1902-
gFaustSuperDirectory = fileDirname(gFaustDirectory);
1903-
gFaustSuperSuperDirectory = fileDirname(gFaustSuperDirectory);
1903+
gFaustExeDir = canonicalDirectoryString(exepath::get(argv[0]));
1904+
gFaustRootDir = canonicalDirectoryString(exepath::dirup(gFaustExeDir));
1905+
gFaustDirectory = canonicalDirectoryString(fileDirname(s));
1906+
gFaustSuperDirectory = canonicalDirectoryString(fileDirname(gFaustDirectory));
1907+
gFaustSuperSuperDirectory = canonicalDirectoryString(fileDirname(gFaustSuperDirectory));
19041908

19051909
//-------------------------------------------------------------------------------------
19061910
// init gImportDirList : a list of path where to search .lib files
19071911
//-------------------------------------------------------------------------------------
19081912
if (char* envpath = getenv("FAUST_LIB_PATH")) {
1909-
gImportDirList.push_back(envpath);
1913+
gImportDirList.push_back(canonicalDirectoryString(envpath));
19101914
}
19111915
#ifdef INSTALL_PREFIX
1912-
gImportDirList.push_back(INSTALL_PREFIX "/share/faust");
1916+
gImportDirList.push_back(canonicalDirectoryString(INSTALL_PREFIX "/share/faust"));
19131917
#endif
1914-
1915-
gImportDirList.push_back(exepath::dirup(gFaustExeDir) + "/share/faust");
1916-
gImportDirList.push_back("/usr/local/share/faust");
1917-
gImportDirList.push_back("/usr/share/faust");
1918+
gImportDirList.push_back(
1919+
canonicalDirectoryString(exepath::dirup(gFaustExeDir) + "/share/faust"));
1920+
gImportDirList.push_back(canonicalDirectoryString("/usr/local/share/faust"));
1921+
gImportDirList.push_back(canonicalDirectoryString("/usr/share/faust"));
19181922

19191923
//-------------------------------------------------------------------------------------
19201924
// init gArchitectureDirList : a list of path where to search architectures files
19211925
//-------------------------------------------------------------------------------------
19221926
if (char* envpath = getenv("FAUST_ARCH_PATH")) {
1923-
gArchitectureDirList.push_back(envpath);
1927+
gArchitectureDirList.push_back(canonicalDirectoryString(envpath));
19241928
}
1925-
gArchitectureDirList.push_back(gFaustDirectory + "/architecture");
1926-
gArchitectureDirList.push_back(gFaustSuperDirectory + "/architecture");
1927-
gArchitectureDirList.push_back(gFaustSuperSuperDirectory + "/architecture");
1929+
gArchitectureDirList.push_back(canonicalDirectoryString(gFaustDirectory + "/architecture"));
1930+
gArchitectureDirList.push_back(
1931+
canonicalDirectoryString(gFaustSuperDirectory + "/architecture"));
1932+
gArchitectureDirList.push_back(
1933+
canonicalDirectoryString(gFaustSuperSuperDirectory + "/architecture"));
19281934
#ifdef INSTALL_PREFIX
1929-
gArchitectureDirList.push_back(INSTALL_PREFIX "/share/faust");
1930-
gArchitectureDirList.push_back(INSTALL_PREFIX "/include");
1931-
#endif
1932-
gArchitectureDirList.push_back(exepath::dirup(gFaustExeDir) + "/share/faust");
1933-
gArchitectureDirList.push_back(exepath::dirup(gFaustExeDir) + "/include");
1934-
gArchitectureDirList.push_back("/usr/local/share/faust");
1935-
gArchitectureDirList.push_back("/usr/share/faust");
1936-
gArchitectureDirList.push_back("/usr/local/include");
1937-
gArchitectureDirList.push_back("/usr/include");
1935+
gArchitectureDirList.push_back(canonicalDirectoryString(INSTALL_PREFIX "/share/faust"));
1936+
gArchitectureDirList.push_back(canonicalDirectoryString(INSTALL_PREFIX "/include"));
1937+
#endif
1938+
gArchitectureDirList.push_back(
1939+
canonicalDirectoryString(exepath::dirup(gFaustExeDir) + "/share/faust"));
1940+
gArchitectureDirList.push_back(
1941+
canonicalDirectoryString(exepath::dirup(gFaustExeDir) + "/include"));
1942+
gArchitectureDirList.push_back(canonicalDirectoryString("/usr/local/share/faust"));
1943+
gArchitectureDirList.push_back(canonicalDirectoryString("/usr/share/faust"));
1944+
gArchitectureDirList.push_back(canonicalDirectoryString("/usr/local/include"));
1945+
gArchitectureDirList.push_back(canonicalDirectoryString("/usr/include"));
19381946

19391947
// for debugging purposes
19401948
// cerr << "gArchitectureDirList:\n";

0 commit comments

Comments
 (0)