
Conversation


@nemecad nemecad commented Sep 4, 2025

This PR addresses some previous feedback, most notably:

  • Set-associative TLB (machine::TLB): Implements a set-associative Translation Lookaside Buffer (TLB) frontend over physical memory, handling virtual-to-physical translation, flushing, and the replacement policy.
  • Pluggable replacement policies (machine::TLBPolicy): Abstract TLB replacement-policy interface and implementations (RAND, LRU, LFU, PLRU) for set-associative tables.
  • SV32 page-table walker (machine::PageTableWalker): Performs multi-level (SV32) page-table walks in memory to resolve a virtual address to a physical one.
  • Sv32Pte bitfield helpers (sv32.h): SV32-specific definitions: page-table entry (PTE) bitfields, shifts/masks, and PTE-to-physical-address helpers (see the sketch after this list).
  • VirtualAddress (virtual_address.h): Lightweight VirtualAddress wrapper offering raw access, alignment checks, arithmetic, and comparisons.
  • Supervisor CSRs and sstatus handling: Adds the supervisor CSRs (sstatus, stvec, sscratch, sepc, scause, stval, satp) and a write handler that presents sstatus as a masked view of mstatus so that the supervisor-visible bits stay in sync.
  • Current privilege level in CoreState: Tracks the hart's current privilege level in CoreState so that exception/return handling and visualization can read and update it from the central CoreState structure.
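As a rough illustration of what the Sv32Pte helpers in sv32.h cover, here is a minimal self-contained sketch of SV32 PTE decoding. The bit positions follow the RISC-V privileged specification; the actual helper names and types in sv32.h may differ.

```cpp
#include <cstdint>

// SV32 PTE flag bits (RISC-V privileged spec); names are illustrative only.
constexpr uint32_t PTE_V = 1u << 0; // valid
constexpr uint32_t PTE_R = 1u << 1; // readable
constexpr uint32_t PTE_W = 1u << 2; // writable
constexpr uint32_t PTE_X = 1u << 3; // executable
constexpr uint32_t PTE_U = 1u << 4; // user-accessible
constexpr uint32_t PTE_G = 1u << 5; // global
constexpr uint32_t PTE_A = 1u << 6; // accessed
constexpr uint32_t PTE_D = 1u << 7; // dirty

// A PTE is a leaf when any of R/W/X is set; otherwise it points to the next level.
inline bool pte_is_leaf(uint32_t pte) {
    return (pte & (PTE_R | PTE_W | PTE_X)) != 0;
}

// PPN occupies bits [31:10]; for a 4 KiB leaf the physical address is
// PPN * 4096 plus the page offset taken from the virtual address.
inline uint64_t pte_to_phys(uint32_t pte, uint32_t vaddr) {
    const uint64_t ppn = pte >> 10;
    return (ppn << 12) | (vaddr & 0xFFFu);
}
```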

Tests:

  • Add SV32 page-table + TLB integration tests: a set of small assembly tests that exercise the SV32 page-table walker, SATP enablement, and the new TLB code. The tests create a root page table, map a virtual page at 0xC4000000, and then exercise several scenarios, verifying page-table walker behaviour, SATP switching, and TLB caching/flush logic. The tests were written based on prior consultation.

UI Components:

  • Show the current privilege level in the core state view:
Screenshot 2025-09-04 115127
  • Virtual memory configuration in NewDialog:
Screenshot 2025-09-04 115904
  • TLB visualization and statistics dock:
Screenshot 2025-09-04 115521
  • VM toggle and "As CPU" memory access view:
Screenshot 2025-09-04 120034

@jdupak jdupak self-requested a review September 28, 2025 17:17
Collaborator

jdupak commented Sep 28, 2025

I am getting this weird zoom.

Collaborator

jdupak commented Sep 28, 2025

Notice that the address sanitizer is failing in CI.

void tlb_update(unsigned way, unsigned set, bool valid, unsigned asid, quint64 vpn, quint64 phys, bool write);

private:
const machine::TLB *tlb;
Collaborator

Who owns this pointer?

#include <cstdint>

namespace machine {
enum TLBType { PROGRAM, DATA };
Collaborator

Why does the TLB need to know this?

Member

ppisa commented Sep 29, 2025

@jdupak, thanks for the review of the interfacing to the memory model architecture.

Member

ppisa commented Sep 29, 2025

From my side: the changes to the processor pipeline diagram have been applied directly to the SVG files (src/gui/windows/coreview/schemas), but the current design uses the DRAW.IO source (extras/core_graphics) as the authoritative source of the pipeline visualization, and the SVGs are generated from that file. So the commit with the SVG change should include the extras/core_graphics/diagram.drawio change as well, or the diagram.drawio change should be committed before the SVG regeneration commit. In the long term I would lean towards a single SVG file with tags for conditional rendering, but we have not reached that state yet; the current solution implemented by @jdupak is based on DRAW.IO with exports controlled by tagging (see docs/developer/coreview-graphics/using-drawio-diagram.md).

Member

ppisa commented Sep 29, 2025

For the memory view, I would not complicate it with a Show virtual checkbox. I would only switch between As CPU (VMA), Cached, and Raw.

@nemecad nemecad force-pushed the feature/sv32-vm-tlb-ptw-cleanup branch 5 times, most recently from 0bb04d1 to ca4300b Compare October 5, 2025 19:16
Author

nemecad commented Oct 5, 2025

@ppisa @jdupak Thank you for your detailed feedback. I appreciate it and have made some changes based on your review. I would be grateful for any further feedback.

@jdupak jdupak force-pushed the feature/sv32-vm-tlb-ptw-cleanup branch from ca4300b to a6cbf71 Compare October 19, 2025 14:48
Collaborator

There is a typo in the dir name

Collaborator

There is no CMake logic to run these tests. I think we want to run them as CLI tests.

Author

Thank you for the comment. I’ve added the CMake logic to run these as CLI tests in the commit 7a204cf.

@jdupak jdupak force-pushed the feature/sv32-vm-tlb-ptw-cleanup branch from a6cbf71 to bc32933 Compare October 19, 2025 16:13
Collaborator

jdupak commented Oct 19, 2025

I pushed some slight edits. Barring the issue with the new tests not being run, I am fine with merging this.

Member

ppisa commented Oct 29, 2025

I am going through the code. One overall remark: there are a lot of formatting changes mixed into the functional changes. I am not against formatting changes, even though I think that formatting left by a human, for example aligning some case lines into columns, sometimes has value. But formatting changes unrelated to the functional changes make review harder. So I would keep the patches as they are, but I would suggest separating formatting changes, even across all later-modified files in the series, from the functional changes.

Member

ppisa commented Oct 29, 2025

I do not like the is_mmio_region() and bypass_mmio() concept. Peripheral accesses should go through regular address translation. It is the responsibility of the OS to map I/O-related regions into the virtual address space of the kernel, or even of a user application, e.g. for mmap()-like accesses.

As for enabling caching of accesses, there is a hack in QtRvSim which enforces the following uncached region:

Cache::Cache
    uncached_start(0xf0000000_addr)
    uncached_last(0xfffffffe_addr)

In the longer term, cacheability should be controlled from the page tables, but PBMT (Page-Based Memory Types) is supported only for Sv39 and larger translation configurations; see Chapter 14, "Svpbmt" Extension for Page-Based Memory Types:

| Mode | Value | Requested Memory Attributes |
| ---- | ----- | --------------------------- |
| PMA  | 0 | None |
| NC   | 1 | Non-cacheable, idempotent, weakly-ordered (RVWMO), main memory |
| IO   | 2 | Non-cacheable, non-idempotent, strongly-ordered (I/O ordering), I/O |
| -    | 3 | Reserved for future standard use |

But the physical region marked to skip caching in the cache implementation (the current state) should be enough for now.

Member

ppisa commented Oct 29, 2025

Not so critical for now, but it should be solved in the longer term. SRET can be executed even in M mode, so the type of the return should be propagated to control_state->exception_return in Core::memory(const ExecuteInterstage &dt). It is a question whether to add a signal which goes through all stages (more readable) or to use a bit from the instruction for local decoding of the type. MRET should not be allowed in supervisor mode. In general, I think the current version does not mark system-level instructions and accesses to the supervisor and machine mode CSRs as invalid in U mode, so some masking will be required at the decode level in the future. Some more flags need to be added to enum InstructionFlags and flags_to_check to allow that checking, and they should then be updated on mode transitions.

The behavior of the xRET instructions is described in 3.1.6.1 Privilege and Global Interrupt-Enable Stack in mstatus register. When SRET is executed in M mode, it executes the same as in S mode, but it should also clear MPRV (MPRV=0). This allows emulating some system-level operations in machine-level code.
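As a concrete illustration of the mstatus update involved, here is a minimal self-contained sketch of the SRET side effects (bit positions per the RV32 privileged spec: SIE=1, SPIE=5, SPP=8, MPRV=17). It is not the QtRvSim implementation; it only shows that SIE is restored from SPIE, the new privilege comes from SPP, and MPRV is cleared because the destination privilege is below M.

```cpp
#include <cstdint>

constexpr uint32_t MSTATUS_SIE  = 1u << 1;
constexpr uint32_t MSTATUS_SPIE = 1u << 5;
constexpr uint32_t MSTATUS_SPP  = 1u << 8;
constexpr uint32_t MSTATUS_MPRV = 1u << 17;

struct SretEffect {
    uint32_t mstatus;  // updated mstatus value
    unsigned new_priv; // 0 = U, 1 = S
};

// Architectural effect of SRET (regardless of whether it is executed in S or
// M mode): pop the supervisor interrupt-enable stack and clear MPRV, since
// the destination privilege (taken from SPP) is always below M.
SretEffect apply_sret(uint32_t mstatus) {
    SretEffect e {};
    e.new_priv = (mstatus & MSTATUS_SPP) ? 1u : 0u;                    // priv <- SPP
    uint32_t m = mstatus;
    m = (m & ~MSTATUS_SIE) | ((m & MSTATUS_SPIE) ? MSTATUS_SIE : 0u);  // SIE <- SPIE
    m |= MSTATUS_SPIE;                                                 // SPIE <- 1
    m &= ~MSTATUS_SPP;                                                 // SPP <- U
    m &= ~MSTATUS_MPRV;                                                // MPRV <- 0
    e.mstatus = m;
    return e;
}
```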

Member

ppisa commented Oct 29, 2025

It seems that the TLBs are updated from the start of the system. The TLB and its updates should be enabled only when the root page-table register (satp) is set, and they should not be updated in M mode at all.
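A minimal sketch of the intended gating condition (an assumed helper, not the PR's actual code): translation and TLB fills should apply only when Sv32 is enabled in satp and the hart runs below M mode.

```cpp
#include <cstdint>

// For RV32, satp.MODE is bit 31 (0 = Bare, 1 = Sv32); M mode (priv == 3)
// bypasses translation. The MPRV corner case is ignored for simplicity.
static inline bool should_translate(uint32_t satp_raw, unsigned priv) {
    const bool sv32_enabled = (satp_raw >> 31) & 1u;
    const bool below_machine = priv < 3;
    return sv32_enabled && below_machine;
}
```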

Collaborator

jdupak commented Oct 29, 2025

There is one actual issue from CI: you cannot use ftruncate. It fails compilation on Win.

Collaborator

jdupak commented Oct 30, 2025

> There is one actual issue from CI: you cannot use ftruncate. It fails compilation on Win.

Never mind, this is broken on master. I will fix that. It does not block this PR.

@nemecad nemecad force-pushed the feature/sv32-vm-tlb-ptw-cleanup branch 3 times, most recently from c846a9e to 442a091 Compare November 2, 2025 16:54
@nemecad nemecad force-pushed the feature/sv32-vm-tlb-ptw-cleanup branch from 861f836 to b5940e8 Compare November 14, 2025 12:11
Member

@ppisa ppisa left a comment

I have added comments and documented some issues which have already been expressed in the discussion.

I have noticed a problem in the official RISC-V test rv32ui-p-fence_i. It occurs in the cached variant regardless of the pipeline/single-cycle and 32/64-bit variants. @jdupak, it is strange that the failure of a single official test does not propagate to a failure of the whole test series.

The problem seems to appear in some change after the Machine: add supervisor CSRs and status handling commit, or it could have been introduced by my rearrangement of the changes.

rv32ui-p-fence_i: ERROR
[INFO]  machine.ProgramLoader:	Loaded executable: 32bit
[INFO]  machine.TLB:	TLB[I] constructed; sets=16 way=1
[INFO]  machine.TLB:	TLB[D] constructed; sets=16 way=1
[INFO]  machine.TLB:	TLB: SATP changed → flushed all; new SATP=0x00000000
[INFO]  machine.TLB:	TLB: SATP changed → flushed all; new SATP=0x00000000
[INFO]  machine.TLB:	TLB: SATP changed → flushed all; new SATP=0x00000000
[INFO]  machine.TLB:	TLB: SATP changed → flushed all; new SATP=0x00000000
[INFO]  machine.BranchPredictor:	Initialized branch predictor: None
[INFO]  machine.TLB:	TLB[D]: flushed all entries
[INFO]  machine.TLB:	TLB[I]: flushed all entries
[DEBUG] machine.core:	Exception cause 11 instruction PC 0x80000180 next PC 0x80000184 jump branch PC 0x8000017c registers PC 0x80000184 mem ref 0x00000000

Machine stopped on ECALL_M exception.

mem = machine->cache_data();
}
} else {
if (access_through_cache == 2) {
Member

Regarding the proposed enum: when I think about its use, it could be interesting to have an option to look at memory at the virtual level even when the CPU is in machine mode, because then you can observe what the hypervisor or SBI does with user or system memory:

enum MemoryAccessAtLevel {
    MEM_ACC_AS_CPU = 0,
    MEM_ACC_VIRT_ADDR = 1,
    MEM_ACC_PHYS_ADDR = 2,
    MEM_ACC_PHYS_ADDR_SKIP_CACHES = 3,
    MEM_ACC_AS_MACHINE = 4,
};

}
if (auto data_tlb = dynamic_cast<TLB *>(mem_data)) {
data_tlb->on_privilege_changed(restored);
}
Member

The dynamic casts are a last resort, and the core should (ideally) know nothing about the TLB except for propagating some control commands.

One option is to add a standard (synchronous) signal emitted from set_current_privilege in the Core (it is a QObject) and connect this signal to the TLBs.

But when I think about it, the right solution is to modify the memory access methods FrontendMemory::write_ctl and FrontendMemory::read_ctl to propagate some control signals, probably by a pointer which can be null, or maybe with default parameters when passed by value (some struct which fits into 32 bits, or a uint in such a case). These additional signals should propagate the privilege level and the ASID. This is how it is done on real CPUs, i.e. when processor chips exposed the bus to an external MMU (68020), or when the buses are routed into FPGA fabric today. The control signals should be the privilege level and the current ASID. The ASID should be held in the core state and synchronized by some signal from the CSR writes...

TLB::on_privilege_changed should not be needed, and it certainly should not flush TLB entries; that would cause extreme overhead for system calls and machine exceptions. TLB flushes are managed by the operating system when page tables are modified or when the mapping of memory contexts to ASIDs changes. Even a switch to another memory context does not need a flush when the ASIDs are unique.


#include "common/logging.h"
#include "execute/alu.h"
#include "memory/tlb/tlb.h"
Member

The TLB integration has to be solved in such a way that the internal core logic does not need to know how it works or how it is implemented. The same applies to the tests, etc.

TLBType type;
const TLBConfig tlb_config;
uint32_t current_satp_raw = 0;
CSR::PrivilegeLevel current_priv_ = CSR::PrivilegeLevel::MACHINE;
Member

The logic should be structured in such a way that this field is not needed.

Member

There can be a use for keeping the last access privilege level and ASID, or something similar, for visualization purposes, but certainly not for the real translation work.

namespace machine {

inline bool is_mode_enabled_in_satp(uint32_t satp_raw) {
return (satp_raw & (1u << 31)) != 0;
Member

I do not like this inline function here. It should probably go into the TLB header.

return InstructionFlags(flags_to_check);
}

static CSR::PrivilegeLevel decode_xret_type_from_inst(const Instruction &inst) {
Member

This should be solved some other way. I would suggest not solving this at the decode level at all and leaving the decision to ControlState::read and write, or at least to the memory stage, where the illegal-instruction exception would be raised. It cannot go through the standard exception signal from the CSR block; that would be too late. It has to be done by a return value or some other way, e.g. an optional pointer to a status return. The illegal-instruction exception should be raised even if a write to a read-only register is attempted, and even when a non-existent register is addressed. All this information cannot be gathered at the decode stage; it would cost too much.

There is a related discussion about the RISC-V standard, which allows some situations where accesses to non-existent/unspecified CSRs are reserved, but the conclusion is that these should result in an illegal-instruction exception as well, except for some exotic arrangements:

riscv/riscv-isa-manual#1116

// Mark illegal if current privilege is lower than encoded xRET type (e.g. MRET executed in S-mode)
if (state.current_privilege() < inst_xret_priv) {
excause = EXCAUSE_INSN_ILLEGAL;
}
Member

This change should not be needed. It should be solved by manipulation of

const InstructionFlags check_inst_flags_val;
const InstructionFlags check_inst_flags_mask;

in set_current_privilege and by annotation of the instructions with the required mode (IMF_PRIV_S, IMF_PRIV_H, IMF_PRIV_M) in the decoding tables.
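A minimal self-contained sketch of the mask/value idea (the flag values and the gate structure are assumptions; only the IMF_PRIV_* names come from this comment): set_current_privilege() recomputes the pair, and decode rejects an instruction whose masked flags differ from the expected value.

```cpp
#include <cstdint>

enum InstructionFlags : uint32_t {
    IMF_PRIV_S = 1u << 20, // requires at least S mode (placeholder bit)
    IMF_PRIV_H = 1u << 21, // requires at least H mode (placeholder bit)
    IMF_PRIV_M = 1u << 22, // requires M mode (placeholder bit)
};

struct PrivilegeGate {
    uint32_t check_inst_flags_mask = 0; // which privilege flags still matter
    uint32_t check_inst_flags_val = 0;  // required value of those flags (0 = must be absent)

    void set_current_privilege(unsigned priv) { // 0 = U, 1 = S, 2 = H, 3 = M
        check_inst_flags_mask = IMF_PRIV_S | IMF_PRIV_H | IMF_PRIV_M;
        check_inst_flags_val = 0;
        if (priv >= 1) check_inst_flags_mask &= ~uint32_t(IMF_PRIV_S);
        if (priv >= 2) check_inst_flags_mask &= ~uint32_t(IMF_PRIV_H);
        if (priv >= 3) check_inst_flags_mask &= ~uint32_t(IMF_PRIV_M);
    }

    // Legal if the instruction does not require a privilege above the current one.
    bool is_legal(uint32_t inst_flags) const {
        return (inst_flags & check_inst_flags_mask) == check_inst_flags_val;
    }
};
```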

Author

Thank you for your detailed feedback. I appreciate it and have made some changes based on your review. I would be grateful for any further feedback.

@nemecad nemecad force-pushed the feature/sv32-vm-tlb-ptw-cleanup branch from b5940e8 to 0fc627f Compare November 19, 2025 19:40
Member

@ppisa ppisa left a comment

Thanks for the updated version and the work. In general it looks good, except for the delivery of AccessMode along with the address through the hierarchy.

@jdupak has updated master to use libelfin, which now passes even the Windows tests, so the TLB work should be rebased on that. He can help with doing that with the formatting, so decide whether he should do the rebase first and you then attempt to resolve AccessMode, or vice versa.

[[nodiscard]] RegisterValue read_ctl(enum AccessControl ctl, Address source) const;
[[nodiscard]] RegisterValue read_ctl(enum AccessControl ctl, Address source, uint32_t ctrl_info = 0) const;

virtual void handle_control_signal(uint32_t ctrl_info) { Q_UNUSED(ctrl_info); }
Member

I think that control signals should not be processed by a separate function; they should be passed along through the memory front-end hierarchy. So there should be no handle_control_signal method.

bool write_uXX(Address address, uintXX_t value, AccessEffects type = ae::REGULAR)

uintXX_t read_uXX(Address address, AccessEffects type = ae::REGULAR) const;

and the related templates

template<typename T> T read_generic(Address address, AccessEffects type) const;

template<typename T> bool write_generic(Address address, T value, AccessEffects type);

should be updated, and ctrl_info should be added after the address or before the address parameter.

ctrl_info should have a defined type. It can be realized internally as uint32_t or something similar so that it maps efficiently, but maybe bit-fields with specified sizes are enough. It would be worth keeping the whole size at 32 bits at most.

The type should hold the ASID (16 bits at maximum, even for Sv39 and Sv48), the privilege level (2 bits) and an uncached flag (1 bit).
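A minimal sketch of such a packed type, assuming C++ bit-fields are acceptable here (the names are placeholders, not a final API):

```cpp
#include <cstdint>

// ASID, privilege level and an uncached flag packed into 32 bits so the value
// can be passed cheaply by value through the memory front-end hierarchy.
struct AccessMode {
    uint32_t asid : 16;     // address-space ID (16 bits also covers Sv39/Sv48)
    uint32_t priv : 2;      // 0 = U, 1 = S, 3 = M
    uint32_t uncached : 1;  // request to bypass caches for this access
    uint32_t reserved : 13;

    bool is_machine() const { return priv == 3; }
};
static_assert(sizeof(AccessMode) == 4, "AccessMode must fit into 32 bits");
```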

Member

The type name could be something like AccessMode.

[[nodiscard]] virtual LocationStatus location_status(Address address) const;
[[nodiscard]] virtual uint32_t get_change_counter() const = 0;

static inline CSR::PrivilegeLevel unpack_priv(uint32_t token) {
Member

These operations should be methods of the AccessMode type.

}

void TLB::handle_control_signal(uint32_t ctrl_info) {
CSR::PrivilegeLevel priv = unpack_priv(ctrl_info);
Member

This should not be needed in the end. The AccessMode reaches the TLB through read and write.

QString::number(ctl));
}
}
handle_control_signal(ctrl_info);
Member

This should not be required here. ctrl_info has to be propagated into the functions that realize the access. I would suggest adding a parameter ctrl_info, or better AccessMode mode, before the value, to keep the order address, mode consistent for read and write.

QString::number(ctl));
}
}
const_cast<FrontendMemory *>(this)->handle_control_signal(ctrl_info);
Member

This location has no effect for correct accesses, because they return from the function as a result of the switch processing. You would need to switch the TLB mode before the access even if it is done by this separate function. But I discourage this; the mode signals should continue along with the address and data through the hierarchy.

Member

The mode should be delivered to

Address TLB::translate_virtual_to_physical(Address vaddr)

in the TLB case from

    WriteResult write(Address dst, const void *src, size_t sz, WriteOptions opts) override {
        return translate_and_write(dst, src, sz, opts);
    }
    ReadResult read(void *dst, Address src, size_t sz, ReadOptions opts) const override {
        return const_cast<TLB *>(this)->translate_and_read(dst, src, sz, opts);
    }

translate_virtual_to_physical should also check the permissions based on the mode.

Author

Thank you for your detailed feedback. I think it would be better if @jdupak did the rebase first, and then I would make the other changes.

Collaborator

@nemecad Rebase done

Author

Thank you for the rebase. I added the AddressWithMode type and made the required modifications.  I would be grateful for any further feedback.

Member

ppisa commented Dec 3, 2025

Then there is another option: define a type AddressWithMode which combines and inherits from Address and AccessMode, so that it can be passed through the front-end hierarchy with minimal modifications; with some automatic promotion it could even be assigned from a plain Address. @jdupak, do you have an opinion about this? It is questionable what should be defined as the default AccessMode in such a case, so it is better if leaving it unspecified causes errors. From the point of view of the memory viewers it is best to have the highest privilege, but that automatically disables translation. So probably the current mode should be copied into AddressWithMode from the core state in the memory view. But that is only relevant for MEM_ACC_AS_CPU; in the other cases M mode is OK, and a cache disable is not required because it is controlled through the cache options, etc.
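A minimal sketch of the idea, with placeholder Address and AccessMode types standing in for the real QtRvSim classes; whether a default AccessMode should exist at all is exactly the open question above, so the M-mode default here is only to illustrate the implicit promotion:

```cpp
#include <cstdint>

struct Address { uint64_t raw = 0; };

struct AccessMode {
    uint16_t asid = 0;
    uint8_t priv = 3;       // 0 = U, 1 = S, 3 = M; default shown as machine mode
    bool uncached = false;
};

// Bundles the address with the access mode so the pair travels through the
// memory front-end hierarchy as a single value.
struct AddressWithMode : Address, AccessMode {
    // Implicit promotion from a plain Address keeps existing call sites
    // compiling; the defaulted mode means "machine mode, no translation".
    AddressWithMode(Address a, AccessMode m = {}) : Address(a), AccessMode(m) {}
};
```

Existing code that passes a bare Address would keep compiling, while the memory view could construct an AddressWithMode with the privilege copied from the core state.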

Member

ppisa commented Dec 4, 2025

When I think about CSR access privileges, it is a question whether the register-number-to-privilege mapping should be resolved in Instruction::flags_alu_op_mem_ctl at all. The standard defines that an access to an unimplemented CSR register should result in an illegal-instruction exception. This means that this exception should be reported directly by control_state->read in Core::decode and/or control_state->write in Core::memory. It can be done directly by the read/write function, through a signal, or by a return value. It is already implemented by a C++ exception in

RegisterValue ControlState::read(Address address, PrivilegeLevel current_priv) const
void ControlState::write(Address address, RegisterValue value, PrivilegeLevel current_priv)
size_t ControlState::get_register_internal_id(Address address)

It should be changed to an emulated core exception on the corresponding instruction. This can be done by returning information about the CSR access error to the decode and memory stages and converting it to an exception there, or it can be solved by catching the C++ exception there and converting it to an exception on the matching instruction.

When the code is rebased and some prototype of AddressWithMode or another change is prepared, we can meet and discuss it together to find the right solution for the CSRs and finalize the privilege effects in the TLB.

jdupak and others added 5 commits December 6, 2025 16:25
Add tracking of the hart's current privilege level to the core state so code
handling exceptions/returns and visualization can read/update it from the
central CoreState structure.
The following supervisor CSRs have been added:
  sstatus, stvec, sscratch, sepc, scause, stval, satp
A write handler has been added as well. It presents sstatus
as a masked view of mstatus so that the supervisor-visible bits
stay in sync.
@jdupak jdupak force-pushed the feature/sv32-vm-tlb-ptw-cleanup branch 2 times, most recently from 57a04ab to 0b179b0 Compare December 6, 2025 16:12
Collaborator

jdupak commented Dec 6, 2025

> Then there is another option: define a type AddressWithMode which combines and inherits from Address and AccessMode, so that it can be passed through the front-end hierarchy with minimal modifications; with some automatic promotion it could even be assigned from a plain Address. @jdupak, do you have an opinion about this? It is questionable what should be defined as the default AccessMode in such a case, so it is better if leaving it unspecified causes errors. From the point of view of the memory viewers it is best to have the highest privilege, but that automatically disables translation. So probably the current mode should be copied into AddressWithMode from the core state in the memory view. But that is only relevant for MEM_ACC_AS_CPU; in the other cases M mode is OK, and a cache disable is not required because it is controlled through the cache options, etc.

Can you elaborate on why this is needed? I did not follow the whole discussion.

Member

ppisa commented Dec 7, 2025

Thanks for the rebase.

> Can you elaborate on why this is needed? I did not follow the whole discussion.

The TLB translation has to be skipped in M mode. In addition, the translation needs to be done with the ASID matching the core ASID, and for Sv39 and Sv48 it can even result in a request to disable caching for a given address, which should be propagated to the cache component in the front-end path. The currently proposed solution for the mode is to keep a copy of the core state (ASID and mode) in each TLB, which requires updating them. Because this state influences memory accesses, the propagation of the mode and ASID has been moved to the memory ctl operations; but because it is not propagated through the hierarchy together with the address, there is a virtual method which caches this control information and updates the state. At the moment it is even called after the access, which is incorrect anyway, requires type casting, and cannot reach the lower levels of the hierarchy.

On the other hand, propagating the access mode together with the address is natural, and it was done even in historical CPUs with an external MMU (see the FC0/FC1/FC2 signals on slides 17 and 18 of my presentation m68k Classic CISC Architecture). When the mode and ASID are passed together with the address, there is no need for an ASID and mode state copy in the TLBs. Such a field could be kept there to hold information for visualization of the last translation, but not for the actual translation. If we consider today's hyper-threaded CPUs with simultaneous execution of multiple threads, they have to pass the thread number and mode together from the load/store unit to keep track of which access belongs to which thread (or hart, in the RISC-V case) in order to use the correct TLB entries. So I think this would simplify the code and make the simulator more true to real CPU implementations. I have already pointed out some problems of the current solution in the code review above.

@nemecad nemecad force-pushed the feature/sv32-vm-tlb-ptw-cleanup branch from 0b179b0 to 63b7406 Compare December 11, 2025 11:02
Member

ppisa commented Dec 11, 2025

@nemecad Thanks for the update. I like it; the core changes are minimal.

I propose to drop the following change. It should not be a problem, because an incorrect mode would be caught at the CSR access as an exception, which stops the simulation for now (at least I think that is so). We can resolve the mapping of an invalid access or privilege to an optionally enabled illegal-instruction exception delivered to the machine-mode handler later.

So please drop the following segment:

src/machine/instruction.cpp

+    if (flags & IMF_CSR) {
+        machine::CSR::Address csr = csr_address();
+        uint32_t csr12 = csr.data & 0xfffu;
+        uint32_t min_bits = (csr12 >> 8) & 0x3u; // csr[9:8]
+        switch (min_bits) {
+        case 0u:
+            // User (no extra flag required)
+            break;
+        case 1u: flags = InstructionFlags(flags | IMF_PRIV_S); break;
+        case 2u: flags = InstructionFlags(flags | IMF_PRIV_H); break;
+        case 3u: flags = InstructionFlags(flags | IMF_PRIV_M); break;
+        default: break;
+        }
+    }

I would be happy if @jdupak checked the code and gave his opinion about its state. My stance is to merge it at this stage and continue with the next incremental round for privileges, with testing, some code or even RTOS porting, and, if time allows, extending to Sv39.

@nemecad nemecad force-pushed the feature/sv32-vm-tlb-ptw-cleanup branch from 63b7406 to beb0886 Compare December 11, 2025 12:34
Member

ppisa commented Dec 11, 2025

Thanks for the update.

I have some remarks which could even be solved later.

I think that we should not care about fixed uncached regions in the TLB-related code at all.

I am for removing the following lines in src/machine/memory/tlb/tlb.h

+    const Address uncached_start;
+    const Address uncached_last;

and is_in_uncached_area() in src/machine/memory/tlb/tlb.cpp

+bool TLB::is_in_uncached_area(Address source) const {
+    return (source >= uncached_start && source <= uncached_last);
+}

These should not influence the translation condition if (!should_translate || is_in_uncached_area(vaddr)) {

etc.

As on a real processor, a write to SATP should not result in invalidation, as is done in the code

+void TLB::on_csr_write(size_t internal_id, RegisterValue val) {
+    if (internal_id != CSR::Id::SATP) return;
+    current_satp_raw = static_cast<uint32_t>(val.as_u64());

this would negate all the effort to add ASIDs, which allow keeping TLB entries across memory context switches. But I would keep the code in this phase, because it also simplifies the implementation of the tests and the first round of experiments; then we can enhance the tests and the code. A flush on the change from disabled to enabled and vice versa can be useful for simulator users, even though it does not match real HW, even in the future.

The rules and instruction variants for TLB::flush() are more complex as well, to allow finer control. But flushing everything should be a safe option that covers all cases, so it is OK for now.

On the other hand, the single-entry flush

void TLB::flush_single(VirtualAddress va, uint16_t asid)

should respect the definition in the manual,

12.2.1. Supervisor Memory-Management Fence Instruction,

where the SFENCE.VMA instruction is defined.

When SFENCE.VMA is invoked with rs2≠x0, the TLB entries matching the ASID in rs2 should be invalidated, but when rs2=x0, entries should be flushed regardless of their ASID. So TLB::flush_single() should have an additional ignore_asid flag, or the following change should be applied:

- if (e.valid && e.vpn == vpn && e.asid == asid) {
+ if (e.valid && e.vpn == vpn && (e.asid == asid || asid == 0)) {

But again that is not critical for now.

So I see the removal of the redundant uncached handling as the suggested change.
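For illustration of the SFENCE.VMA rule mentioned above, a minimal sketch of a flush_single with an explicit ignore_asid flag (the entry layout is assumed, not the PR's actual structure):

```cpp
#include <cstdint>
#include <vector>

struct TlbEntry {
    bool valid = false;
    uint32_t vpn = 0;
    uint16_t asid = 0;
};

// When SFENCE.VMA has rs2 == x0, the caller sets ignore_asid and matching
// entries are invalidated regardless of their ASID.
void flush_single(std::vector<TlbEntry> &entries, uint32_t vpn,
                  uint16_t asid, bool ignore_asid) {
    for (auto &e : entries) {
        if (e.valid && e.vpn == vpn && (ignore_asid || e.asid == asid)) {
            e.valid = false;
        }
    }
}
```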

Member

ppisa commented Dec 11, 2025

There should be some handling specific to global pages in

Address TLB::translate_virtual_to_physical(AddressWithMode vaddr) {

around the line

if (e.valid && e.vpn == vpn && e.asid == asid) {

and related code. But again, that could be solved in the next round.
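A minimal sketch of a hit check that honours global pages (the entry fields are assumed, mirroring the PTE G bit): a global entry matches any ASID.

```cpp
#include <cstdint>

struct TlbEntry {
    bool valid = false;
    bool global = false; // copied from the PTE G bit on fill
    uint32_t vpn = 0;
    uint16_t asid = 0;
};

static inline bool entry_matches(const TlbEntry &e, uint32_t vpn, uint16_t asid) {
    return e.valid && e.vpn == vpn && (e.global || e.asid == asid);
}
```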

Member

ppisa commented Dec 11, 2025

The function

Address TLB::translate_virtual_to_physical(AddressWithMode vaddr)

would need an argument specifying whether the access is a read or a write. The code should then check the permissions.

Another problem to check is whether the access hits an entry which does not have the A bit set; the PTE (page-table entry) has to be updated in memory in such a case. For a write, an additional update is required when D is not set.

A is usually set during the initial load of the entry. D usually requires another round of lookup, even for valid TLB entries, to update the PTD and PTE entries.

This can be postponed to the next round.
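A minimal sketch of the permission and A/D-bit check described above, assuming a valid leaf PTE and Sv32 bit positions (V=0, R=1, W=2, X=3, U=4, A=6, D=7); SUM/MXR handling is omitted:

```cpp
#include <cstdint>

enum class AccessType { READ, WRITE, FETCH };

struct PteCheck {
    bool allowed = false;
    bool needs_a_update = false; // PTE in memory must get A set
    bool needs_d_update = false; // PTE in memory must get D set (writes only)
};

PteCheck check_leaf_pte(uint32_t pte, AccessType type, bool user_mode) {
    const bool r = pte & (1u << 1), w = pte & (1u << 2), x = pte & (1u << 3);
    const bool u = pte & (1u << 4), a = pte & (1u << 6), d = pte & (1u << 7);
    PteCheck res {};
    if (user_mode && !u) return res; // U-mode may only use U pages
    switch (type) {
    case AccessType::READ:  res.allowed = r; break;
    case AccessType::WRITE: res.allowed = w; break;
    case AccessType::FETCH: res.allowed = x; break;
    }
    res.needs_a_update = res.allowed && !a;
    res.needs_d_update = res.allowed && type == AccessType::WRITE && !d;
    return res;
}
```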

Implements a set-associative Translation Lookaside Buffer (TLB)
with replacement policies, Page-Table Walker,
and adds SV32-specific definitions.
Add privilege level mapping to the GUI so the current
hart privilege (UNPRIV, SUPERV, HYPERV, MACHINE) is displayed
in core state visualization.
Extend NewDialog with controls for virtual memory setup,
including TLB number of sets, associativity, and replacement
policy.
Introduce new components for displaying and tracking TLB
state, similar to the cache views. TLBViewBlock and TLBAddressBlock
render per-set and per-way TLB contents, updated on tlb_update
signals. TLBViewScene assembles these views based on associativity.
TLBDock integrates into the GUI, showing hit/miss counts, memory
accesses, stall cycles, hit rate, and speed improvement, with live updates from the TLB.
Introduce an "As CPU (VMA)" access option in the cached
access selector to render memory contents as observed
by the CPU through the frontend interface.
Add a set of small assembly tests that exercise the SV32 page-table walker,
SATP enablement and the new TLB code. The tests create a root page
table and map a virtual page at 0xC4000000, then exercise several scenarios.
The tests verify page-table walker behaviour, SATP switching and TLB
caching/flush logic. The tests were written based on prior consultation.
Ensure that TLBs are only updated when the root register is set,
and disable TLB updates while running in Machine mode.
Decode MRET/SRET/URET in the decode stage, carry the return
type through the interstage registers, and pass it
to ControlState::exception_return in the memory stage.
Extend instruction metadata with privilege flags (IMF_PRIV_M/H/S)
for privileged operations and use them for masking.
@nemecad nemecad force-pushed the feature/sv32-vm-tlb-ptw-cleanup branch from beb0886 to b890b3f Compare December 11, 2025 15:57