Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clock line of SPIF is dipping when you use manual chip select and you mounted an SD #226

Open
lefebvresam opened this issue Jan 27, 2024 · 7 comments
Labels
Bug Dis is broken

Comments

@lefebvresam
Copy link

lefebvresam commented Jan 27, 2024

When you mount/unmount an SD with following config:

"sd.SPI_CS": "SD_CS",
"sd.SPI_MOSI": "SD_MOSI",
"sd.SPI_MISO": "SD_MISO",
"sd.SPI_CLK": "SD_CLK",
"sd.TRX_FREQUENCY": 40000000,
/**** SD card pins ****/ //SPI2
SD_MOSI = PB_15,
SD_MISO = PB_14,
SD_CLK = PB_13,
SD_CS = PB_12,
SDBlockDevice sd;
FATFileSystem fs("fs");
fs.mount(&sd);
fs.unmount();

And you use SPIF with this config:

"spif-driver.SPI_CS": "SPIF_CS",
"spif-driver.SPI_MOSI": "SPIF_MOSI",
"spif-driver.SPI_MISO": "SPIF_MISO",
"spif-driver.SPI_CLK": "SPIF_CLK",            
"spif-driver.SPI_FREQ": 40000000   
/**** SPIF pins ****/ //SPI3
SPIF_MOSI = PC_12,
SPIF_MISO = PC_11,
SPIF_CLK = PC_10,
SPIF_CS = PA_15_ALT0,

And you use legacy code which is not using HW chipselect:

_select(true); // set CS 0
_spiwrite(0x80);            // RS:1 (Cmd/Status), RW:0 (Write)
_spiwrite(command);
if (data <= 0xFF) {   // only if in the valid range
    _spiwrite(0x00);
    _spiwrite(data);
}
_select(false); // set CS 1

During the first SPI cycle there is a unwanted dip in the clock which misconfigures the slave device. This has as result that all the following data that you want to read will be read as zero.

New15

If you remove the lines:

//  fs.mount(&sd);
//  fs.unmount();

Then you have a correct start cycle and the SPI slave is readout correctly:

New16

@lefebvresam lefebvresam added the Bug Dis is broken label Jan 27, 2024
@multiplemonomials
Copy link
Collaborator

multiplemonomials commented Jan 30, 2024

Hmm, this could be the classic issue: when you do the first transfer to an SPI device, if the SPI bus had been in a different mode, it will change the mode (e.g. mode 3 to mode 0), which can cause the clock line to change state (e.g. high to low). This is why we have that warning in the SPI docs about not using a GPIO to control the chip select -- it's an easy way to get yourself in trouble because the chip might get selected before the clock line is ready.

Does the issue still happen if you change the code to pass the chip select pin to SPI as a gpio ssel pin? The SD card driver built in to Mbed does support and use that mode.

@lefebvresam
Copy link
Author

lefebvresam commented Jan 30, 2024

To clarify this issue is on the SPIF and not on the SD. But let me do a test to use the HW chipselect. I'm convinced that the best practice is to transfer the cs control to the SPI module and use 16 bit transfers instead of two times 8 bit transfers. But the point is that a lot of legacy libs on the internet still use SW chipselect and are not optimal. What worked before is for that reason no longer working and It could give a lot of poeple unwanted stress. So in the first place a dip on the clock should not happen when you switch mode. But indead if you switch from active high clock to active low clock this is almost insurmountable. The switch should be done before the SW cs and you have to init again before using SPI SW line. The strange thing is that the SD SPI and the SPIF are on different outputs, so how they can influence each other?

@multiplemonomials
Copy link
Collaborator

Hmm, yeah, if there is only one device on the SPI peripheral, one wouldn't expect a glitch like this to happen each time. Are you sure that the SD and SPIF are on different SPI pins and different SPI peripherals (e.g. SPI2 vs SPI3)?

@lefebvresam
Copy link
Author

I suppose SPI selection is automatically done based on pin assignment. I use these pins in the instatiations:

    /**** SD card pins ****/ //SPI2
    SD_SEL = PC_6,
    SD_MOSI = PB_15,
    SD_MISO = PB_14,
    SD_CLK = PB_13,
    SD_CS = PB_12,

    /**** SPIF pins ****/ //SPI3
    SPIF_SEL = PD_2,
    SPIF_MOSI = PC_12,
    SPIF_MISO = PC_11,
    SPIF_CLK = PC_10,
    SPIF_CS = PA_15_ALT0,

@lefebvresam
Copy link
Author

Because I had this issue another time I did some further investigation of it to find the root cause of the dips in the clock line.
Having these dips invokes writing errors to the graphical chip and sometimes widgets not rendering at the right location.

I have the following:

As soon as the construction of an SPI object is used or derefered, there is an 'aquiring' mechanism to share the same SPI resource between multiple instances.

I commented out a lot of other calls to isolate the issue:

  • RA8875 lcd -> creating SPI object in constructor: , spi(mosi, miso, sclk)
  • DMAFlash dma -> where I inject the lcd object, calls _aquire in lcd construction (why?)
  • SDBlockDevice sd -> where another SPI object is constructed in constructor: _spi(mosi, miso, sclk, cs, use_gpio_ssel)
  • lcd.init() calls frequency() calls spi.format(8, 3); calls _aqcuire in SPI.cpp
  • sd.init() calls _initialize_card() calls _cmd8() calls _cmd calls _prelock_then_select calls _spi.write() calls select() calls _aquire in SPI.cpp
  • lcd.writeData() calls _spirwrite() calls _setWriteSpeed calls frequency() calls _acquire
  • A second write calls _acquire

In _acquire SPI.cpp I put std::printf("o: %u t:%u", (int)_peripheral->owner, (int)this);
The result is that the same entry in the list SPI::spi_peripheral_s is reused with _peripheral_name = GlobalSPI;

construct ra8875
o:0 t:536889908 // (call _acquire)

construct dmaflash

construct sdblock
lcd init
o:536889908 t:536890980 // (call _acquire)
sd init
o:536890980 t:536889776 // (call _acquire)
lcd data
o:536889776 t:536890980 // (call _acquire) -> here I see the dip in the clockline on the scope
o:536890980 t:536890980 // (call _acquire)

In _acuire() there is a context switch where the init is called which fills in the current pins etc and a reconfigure is happening with spi_format() and spi_frequency(). Some registers are written there which invokes transision effects on the clock line.

Next to the solution that using the ssel line as a parameter input during construction of SPI, where the ccsel is pulled low after those context switches, there is another bug I encountered.

The list is filled up with SPI devices selected from the pinmap (spi_get_peripheral_name) but only if DEVICE_SPI_COUNT is defined.

    // Need backwards compatibility with HALs not providing API
#ifdef DEVICE_SPI_COUNT
    _peripheral_name = spi_get_peripheral_name(_mosi, _miso, _sclk);
#else
    _peripheral_name = GlobalSPI;
#endif

In PeripheralNames.h of TARGET_STM32U5 this is not defined.

If I put this:

#define DEVICE_SPI_COUNT 3 // -> this is inserted
typedef enum {
    SPI_1 = (int)SPI1_BASE,
    SPI_2 = (int)SPI2_BASE,
    SPI_3 = (int)SPI3_BASE
} SPIName;

The dip is also gone because this context switch is not there anymore because different HW SPI blocks are now allocated for both interfaces and context switch chances between sd SPI and own SPI are reduced.

So I propose to create a PR for adding this line.

@multiplemonomials
Copy link
Collaborator

Nice debugging! Thank you for tracking this down! Hmm, I do have #179 open for another piece of this, which is that it's kind of dumb for SPI::_acquire() to call init() in the first place -- we should make a separate function than init which does pin muxing only. But this is definitely its own issue. I wonder how many devices have a similar problem.

@multiplemonomials
Copy link
Collaborator

OK I did a random sample of some targets, and seems like at least half the current set of target devices don't implement spi_get_peripheral_name() and don't define DEVICE_SPI_COUNT. So it seems like this is a separate issue, that a lot of the MCU vendors didn't implement this function (likely because the test suite didn't generate any errors if they didn't!),

However, this situation is a bit different, because STM does implement spi_get_peripheral_name() (original PR), but someone simply forgot to define it for STM32U5 and STM32L5. @lefebvresam Please do create a PR that adds this define for U5 and L5!

I have also opened #255 to look into this later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Dis is broken
Projects
None yet
Development

No branches or pull requests

2 participants