DMA #358
We came up with the following interface:
class DmaHalSlotInterface
{
public:
virtual
~DmaHalSlotInterface() = default;
DmaHalSlotInterface() = default;
DmaHalSlotInterface(const DmaHalSlotInterface&) = delete;
DmaHalSlotInterface& operator=(const DmaHalSlotInterface&) = delete;
virtual void
memoryToPeripheral32(uint32_t* memory, volatile uint32_t* peripheral) = 0;
virtual void
peripheralToMemory32(volatile uint32_t* peripheral, uint32_t* memory) = 0;
virtual void
memoryToMemory32(uint32_t*, uint32_t*) = 0;
// Only on devices with double buffering
virtual void
memoryToPeripheral32(uint32_t* memory1, uint32_t* memory2, volatile uint32_t* peripheral) = 0;
virtual void
peripheralToMemory32(volatile uint32_t* peripheral, uint32_t* memory1, uint32_t* memory2) = 0;
// Same as above for data widths of 8 and 16 bits
// ...
virtual void
setTransferLength(uint16_t length) = 0;
enum class
MemoryIncrementMode : uint32_t
{
Fixed = 0,
Increment = {{ reg_prefix }}_MINC, ///< incremented according to MemoryDataSize
};
enum class
PeripheralIncrementMode : uint32_t
{
Fixed = 0,
Increment = {{ reg_prefix }}_PINC, ///< incremented according to PeripheralDataSize
};
virtual void
setIncrement(PeripheralIncrementMode p, MemoryIncrementMode m) = 0;
enum class
Priority : uint8_t
{
Low = 0,
Medium = 1,
High = 2,
VeryHigh = 3,
};
virtual void
setPriority(Priority p) = 0;
enum class Mode : uint8_t {
Normal = 0,
%% if feature["circular_mode"]
CircularMode = 1,
%% endif
};
virtual void
setMode(Mode m) = 0;
struct Context
{
intptr_t memoryAddress1;
intptr_t memoryAddress2; // == 0x00000000 if double buffering is not used
volatile uint32_t* peripheralAddress;
uint16_t transferLength;
uint8_t direction : 2;
uint8_t memoryIncrementMode : 1;
uint8_t peripheralIncrementMode : 1;
uint8_t priority : 2;
};
/// waits until isFinished() (if any transfer has ever been started)
/// stores the context once another driver acquires
virtual bool
acquire(Context* context) = 0;
/// releases instantly, DMA transfer may still be active
virtual void
releaseLater() = 0;
virtual void
stop() = 0;
virtual void
start() = 0;
virtual void
reset() = 0;
virtual bool
isFinished() = 0;
enum class Peripheral {
%% for p in peripherals
{{ p }},
%% endfor
};
virtual void
connect(Peripheral p) = 0;
enum class
FeatureMap : uint8_t
{
DoubleBuffering = Bit0,
CircularMode = Bit1,
Bursts = Bit2,
};
virtual bool
isAvailable(Peripheral p, FeatureMap_t features) = 0;
};
Thanks for setting this up.
Looking into it: could you explain how you intend DMA to be used, e.g. for SPI? Would you enhance the current classes like this:
template <class DMA = void>
class SpiMaster1 : public modm::SpiMaster
{
...
};
and implement the methods depending on whether DMA is an actual class? Who is in charge of configuring the DMA? Would it be done inside e.g. the SPI class? And regarding the buffers, would those be static and fixed-size in the DMA class, or would it simply use the buffers passed to the transmit function? The original implementation uses names like DMA1::Stream1; what is the reason for changing it to slots?
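One way the `template <class DMA = void>` idea could look is sketched below. It is only an illustration of selecting a polling or DMA code path at compile time; `FakeDma` and `SpiMasterSketch` are hypothetical names and not part of modm, and the polling branch uses a loopback copy in place of real register access.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a DMA channel pair (not a modm class).
struct FakeDma
{
    static inline std::size_t lastLength = 0;

    static void
    start(const uint8_t*, uint8_t*, std::size_t length)
    {
        lastLength = length;  // a real driver would program the channel here
    }
};

// Hypothetical SPI master: `Dma = void` selects the polling path.
template <class Dma = void>
class SpiMasterSketch
{
public:
    static constexpr bool usesDma = not std::is_void_v<Dma>;

    // Returns true if the transfer was handed to DMA, false if it was polled.
    static bool
    transfer(const uint8_t* tx, uint8_t* rx, std::size_t length)
    {
        if constexpr (usesDma) {
            // the transfer length in the proposed interface is a uint16_t
            assert(length <= 0xFFFF);
            Dma::start(tx, rx, length);
            return true;
        } else {
            for (std::size_t i = 0; i < length; ++i) {
                // loopback stands in for the register write/read
                if (rx) rx[i] = tx ? tx[i] : 0;
            }
            return false;
        }
    }
};
```

With this shape, `SpiMasterSketch<>` compiles without any DMA code, while `SpiMasterSketch<FakeDma>` never instantiates the polling branch.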
I remember the idea was to provide something like a DMA scheduler, from which anything can request a DMA slot with the necessary features. This has the advantage that the programmer does not have to read the datasheet and reference manuals to identify which DMA slot is suitable. I suspect the dynamic allocation of DMA slots at runtime is not acceptable for some embedded applications, so maybe we should leave the scheduler out for a start.
From a user's perspective I would like to just pass a DMA channel as a template parameter to e.g. the SPI peripheral and let the peripheral handle all configuration and buffers.
Not even ST is consistent with their naming. "Stream" means different things on different STM32s, and other manufacturers use the term differently. We decided to use "slot" to avoid confusion.
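To make the scheduler idea more concrete, here is a minimal host-side sketch of handing out slots by required features. The `Feature` bits mirror the `FeatureMap` from the proposed interface; `Slot` and `SlotScheduler` are hypothetical names, and the table of slots would have to come from device data in a real driver.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

enum Feature : uint8_t
{
    DoubleBuffering = 1 << 0,
    CircularMode    = 1 << 1,
    Bursts          = 1 << 2,
};

struct Slot
{
    uint8_t features;  // bitmask of Feature values this slot supports
    bool inUse;
};

template <std::size_t N>
class SlotScheduler
{
public:
    explicit SlotScheduler(const std::array<Slot, N>& slots) : slots_(slots) {}

    // Returns the index of the first free slot that offers all required features.
    std::optional<std::size_t>
    acquire(uint8_t required)
    {
        for (std::size_t i = 0; i < N; ++i) {
            if (!slots_[i].inUse && (slots_[i].features & required) == required) {
                slots_[i].inUse = true;
                return i;
            }
        }
        return std::nullopt;  // nothing suitable is free right now
    }

    void
    release(std::size_t index)
    {
        assert(index < N && slots_[index].inUse);
        slots_[index].inUse = false;
    }

private:
    std::array<Slot, N> slots_;
};
```

The runtime search is exactly the part that may be unacceptable in some embedded applications; the same table could instead be checked at compile time.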
I'm also in favor of reducing the work for the user and letting the peripheral classes do the required work. Although I'm using SPI as an example, it should apply to other peripherals as well. The STM32 maps channels to specific hardware; it isn't possible to use them freely. Example: SPI1 DMA channels are either
An F4 is different, as it uses streams and then a channel per stream. In the case of SPI1:
So the less complex DMA implementation on the L4 makes it, in my eyes, perfect for having the SPI class do all the configuration and use of DMA. In that case I'd have classes DMA1 and DMA2, which do the basic initialization (e.g. enabling it). That class is then passed as a template parameter to the SPI1 master, which can configure the channels, interrupts etc. based on it. I'll start another project using an F7 soon, then I'll have a look into its DMA implementation. What data is available in the device files to automatically generate DMA classes?
Using the buffers passed to the transmit function reduces memory usage, which is good. Copying the data to/from dedicated buffers would allow preparing the next data to be sent. I'm not sure which is better. One use case in my project is sending data to an SPI TFT. My idea is to store some images compressed with LZ4 in flash and do on-the-fly decompression when one should be displayed. With a small buffer of e.g. 1024 bytes where the uncompressed data is placed, the data can already be written to the TFT once the buffer is filled. Until that is finished, the next chunk can be decompressed.
Looking into the two datasheets, it would have been better if ST had swapped the meaning of channels and streams on the F4 (and others).
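The decompress-while-sending use case amounts to a ping-pong scheme with two chunk buffers. The sketch below shows only the buffer alternation on the host; `streamPingPong`, `produce` and `send` are hypothetical names, `produce` stands in for the decompressor and `send` for starting a DMA transfer. On hardware, `send` would return immediately and the loop would have to wait for the transfer-complete flag before reusing a buffer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// produce(buffer, capacity) fills a chunk and returns the number of bytes written
// (0 when the source is exhausted); send(data, length) transmits one chunk.
template <std::size_t ChunkSize, class Produce, class Send>
std::size_t
streamPingPong(Produce produce, Send send)
{
    static_assert(ChunkSize > 0, "chunk buffers must not be empty");
    std::array<std::array<uint8_t, ChunkSize>, 2> buffers{};
    std::size_t active = 0;
    std::size_t total = 0;

    std::size_t filled = produce(buffers[active].data(), ChunkSize);
    while (filled > 0) {
        assert(filled <= ChunkSize);
        send(buffers[active].data(), filled);  // on hardware: start DMA, do not block
        total += filled;
        active ^= 1;                           // switch to the other buffer
        // on hardware this call overlaps with the running DMA transfer
        filled = produce(buffers[active].data(), ChunkSize);
        // on hardware: wait here until the previous transfer has completed
    }
    return total;
}
```

Memory cost is two chunks instead of one, in exchange for keeping the bus busy while the next chunk is prepared.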
A list of all DMA instances, plus information about streams/channels/requests and their relation to peripherals. Copying data is necessary if the user e.g. passes an array created on the stack to the DMA method.
Here's a working DMA implementation for L4 SPI. It's a quick (and dirty) try. First I created a DMA class:
dma1.hpp
#ifndef MODM_DMA1_HPP
#define MODM_DMA1_HPP
#include <modm/platform/clock/rcc.hpp>
#include <modm/architecture/interface/interrupt.hpp>
#include <modm/architecture/interface/register.hpp>
namespace modm::platform
{
struct dma
{
enum class Alignment {
Byte = 0x00,
PeripheralHalfWord = DMA_CCR_PSIZE_0,
PeripheralWord = DMA_CCR_PSIZE_1,
MemoryHalfWord = DMA_CCR_MSIZE_0,
MemoryWord = DMA_CCR_MSIZE_1,
};
MODM_FLAGS32(Alignment);
enum class Priority {
Low = 0x00,
Medium = DMA_CCR_PL_0,
High = DMA_CCR_PL_1,
VeryHigh = DMA_CCR_PL
};
enum class Mode {
Normal = 0x00,
Circular = DMA_CCR_CIRC
};
enum class Direction {
PeripheralToMemory = 0x00,
MemoryToPeripheral = DMA_CCR_DIR,
MemoryToMemory = DMA_CCR_MEM2MEM
};
enum class IncrementMode {
Memory = DMA_CCR_MINC,
Peripheral = DMA_CCR_PINC
};
MODM_FLAGS32(IncrementMode);
enum class Channel {
Channel1 = 0x00,
Channel2,
Channel3,
Channel4,
Channel5,
Channel6,
Channel7
};
enum class Request {
Request0 = 0x00,
Request1,
Request2,
Request3,
Request4,
Request5,
Request6,
Request7
};
enum class Interrupt {
Global = 0x01,
TransferComplete = 0x02,
HalfTransferComplete = 0x04,
Error = 0x08
};
MODM_FLAGS32(Interrupt);
using IrqHandler = void (*)(void);
};
class Dma1 : public dma
{
public:
static void
enable()
{
Rcc::enable<Peripheral::Dma1>();
}
static void
disable()
{
Rcc::disable<Peripheral::Dma1>();
}
template <dma::Channel DmaChannel>
struct Channel
{
static constexpr dma::Channel channelId { DmaChannel };
static constexpr uint32_t ChannelBase { DMA1_BASE + 0x08 + uint32_t(DmaChannel) * 0x14 };
static constexpr IRQn_Type DmaIrqs[] {
DMA1_Channel1_IRQn, DMA1_Channel2_IRQn, DMA1_Channel3_IRQn,
DMA1_Channel4_IRQn, DMA1_Channel5_IRQn, DMA1_Channel6_IRQn,
DMA1_Channel7_IRQn
};
static void
enable()
{
DMA_Channel_TypeDef *Base = (DMA_Channel_TypeDef *) ChannelBase;
Base->CCR |= DMA_CCR_EN;
}
static void
disable()
{
DMA_Channel_TypeDef *Base = (DMA_Channel_TypeDef *) ChannelBase;
Base->CCR &= ~DMA_CCR_EN;
}
static void
configure(Direction direction, Alignment_t alignment, IncrementMode_t increment,
Priority priority, Mode mode = Mode::Normal)
{
DMA_Channel_TypeDef *Base = (DMA_Channel_TypeDef *) ChannelBase;
Base->CCR = uint32_t(direction) | alignment.value | increment.value |
uint32_t(priority) | uint32_t(mode);
}
static void
setAddresses(Direction direction, uint32_t sourceAddress, uint32_t destinationAddress)
{
DMA_Channel_TypeDef *Base = (DMA_Channel_TypeDef *) ChannelBase;
if (direction == Direction::MemoryToPeripheral) {
Base->CMAR = sourceAddress;
Base->CPAR = destinationAddress;
} else {
Base->CPAR = sourceAddress;
Base->CMAR = destinationAddress;
}
}
static void
setDataLength(std::size_t length)
{
DMA_Channel_TypeDef *Base = (DMA_Channel_TypeDef *) ChannelBase;
Base->CNDTR = length;
}
static void
setTransferErrorIrqHandler(dma::IrqHandler irqHandler)
{
transferError = irqHandler;
}
static void
setTransferCompleteIrqHandler(dma::IrqHandler irqHandler)
{
transferComplete = irqHandler;
}
static void
enableInterrupt(Interrupt_t irq)
{
NVIC_SetPriority(DmaIrqs[uint32_t(DmaChannel)], 1);
NVIC_EnableIRQ(DmaIrqs[uint32_t(DmaChannel)]);
DMA_Channel_TypeDef *Base = (DMA_Channel_TypeDef *) ChannelBase;
Base->CCR |= irq.value;
}
static void
disableInterrupt(Interrupt_t irq)
{
DMA_Channel_TypeDef *Base = (DMA_Channel_TypeDef *) ChannelBase;
Base->CCR &= ~(irq.value);
NVIC_DisableIRQ(DmaIrqs[uint32_t(DmaChannel)]);
}
static void
interruptHandler()
{
static const uint32_t TC_Flag { uint32_t(Interrupt::TransferComplete) << (uint32_t(DmaChannel) * 4) };
static const uint32_t TE_Flag { uint32_t(Interrupt::Error) << (uint32_t(DmaChannel) * 4) };
auto isr { DMA1->ISR };
if (isr & TE_Flag) {
disable();
if (transferError)
transferError();
}
if (isr & TC_Flag and transferComplete)
transferComplete();
DMA1->IFCR |= uint32_t(Interrupt::Global) << (uint32_t(DmaChannel) * 4);
}
static IrqHandler transferError;
static IrqHandler transferComplete;
};
template <dma::Channel dmaChannel, dma::Request dmaRequest>
static void
setPeripheralRequest()
{
// each channel's request selection field is 4 bits wide
DMA1_CSELR->CSELR &= ~(0x0F << (uint32_t(dmaChannel) * 4));
DMA1_CSELR->CSELR |= uint32_t(dmaRequest) << (uint32_t(dmaChannel) * 4);
}
template <Peripheral peripheral>
struct ChannelConfig {
};
};
template <dma::Channel DmaChannel>
dma::IrqHandler
Dma1::Channel<DmaChannel>::transferError(nullptr);
template <dma::Channel DmaChannel>
dma::IrqHandler
Dma1::Channel<DmaChannel>::transferComplete(nullptr);
template <>
struct Dma1::ChannelConfig<Peripheral::Spi1> {
using Rx = Dma1::Channel<dma::Channel::Channel2>;
using Tx = Dma1::Channel<dma::Channel::Channel3>;
static constexpr dma::Request RxRequest = dma::Request::Request1;
static constexpr dma::Request TxRequest = dma::Request::Request1;
};
}
#endif
dma1.cpp
#include <dma1.hpp>
MODM_ISR(DMA1_Channel1)
{
using namespace modm::platform;
Dma1::Channel<dma::Channel::Channel1>::interruptHandler();
}
MODM_ISR(DMA1_Channel2)
{
using namespace modm::platform;
Dma1::Channel<dma::Channel::Channel2>::interruptHandler();
}
MODM_ISR(DMA1_Channel3)
{
using namespace modm::platform;
Dma1::Channel<dma::Channel::Channel3>::interruptHandler();
}
MODM_ISR(DMA1_Channel4)
{
using namespace modm::platform;
Dma1::Channel<dma::Channel::Channel4>::interruptHandler();
}
MODM_ISR(DMA1_Channel5)
{
using namespace modm::platform;
Dma1::Channel<dma::Channel::Channel5>::interruptHandler();
}
MODM_ISR(DMA1_Channel6)
{
using namespace modm::platform;
Dma1::Channel<dma::Channel::Channel6>::interruptHandler();
}
MODM_ISR(DMA1_Channel7)
{
using namespace modm::platform;
Dma1::Channel<dma::Channel::Channel7>::interruptHandler();
}
Then I derived an SPI DMA class:
template <class DmaController>
class SpiMaster1_Dma : public modm::platform::SpiMaster1
{
using BaseClass = modm::platform::SpiMaster1;
using DmaChannelConfig = typename DmaController::template ChannelConfig<Peripheral::Spi1>;
// `typename` is required for dependent type names before C++20
using RxChannel = typename DmaChannelConfig::Rx;
using TxChannel = typename DmaChannelConfig::Tx;
public:
// start documentation inherited
template< class SystemClock, baudrate_t baudrate, percent_t tolerance=pct(5) >
static void
initialize()
{
RxChannel::configure(dma::Direction::PeripheralToMemory, dma::Alignment::Byte,
dma::IncrementMode::Memory, dma::Priority::Medium);
RxChannel::setAddresses(dma::Direction::PeripheralToMemory, SPI1_BASE + 0x0c,
uint32_t(dmaRxBuffer));
RxChannel::setDataLength(sizeof(dmaRxBuffer));
RxChannel::setTransferErrorIrqHandler(handleTransferError);
RxChannel::setTransferCompleteIrqHandler(handleReceiveComplete);
RxChannel::enableInterrupt(dma::Interrupt::Error | dma::Interrupt::TransferComplete);
DmaController::template setPeripheralRequest<RxChannel::channelId, DmaChannelConfig::RxRequest>();
TxChannel::configure(dma::Direction::MemoryToPeripheral, dma::Alignment::Byte,
dma::IncrementMode::Memory, dma::Priority::Medium);
TxChannel::setAddresses(dma::Direction::MemoryToPeripheral, uint32_t(dmaTxBuffer),
SPI1_BASE + 0x0c);
TxChannel::setDataLength(sizeof(dmaTxBuffer));
TxChannel::setTransferErrorIrqHandler(handleTransferError);
TxChannel::setTransferCompleteIrqHandler(handleTransmitComplete);
TxChannel::enableInterrupt(dma::Interrupt::Error | dma::Interrupt::TransferComplete);
DmaController::template setPeripheralRequest<TxChannel::channelId, DmaChannelConfig::TxRequest>();
BaseClass::initialize<SystemClock, baudrate, tolerance>();
SpiHal1::setRxFifoThreshold(SpiHal1::RxFifoThreshold::QuarterFull);
SpiHal1::enableInterrupt(SpiBase::Interrupt::TxDmaEnable | SpiBase::Interrupt::RxDmaEnable);
}
static uint8_t
transferBlocking(uint8_t data)
{
return RF_CALL_BLOCKING(transfer(data));
}
static void
transferBlocking(uint8_t *tx, uint8_t *rx, std::size_t length)
{
RF_CALL_BLOCKING(transfer(tx, rx, length));
}
static modm::ResumableResult<uint8_t>
transfer(uint8_t data)
{
// this is a manually implemented "fast resumable function"
// there is no context or nesting protection, since we don't need it.
// there are only two states encoded into 1 bit (LSB of state):
// 1. waiting to start, and
// 2. waiting to finish.
#if 0
static uint8_t rxData[2];
// LSB != Bit0 ?
if (not (state & Bit0)) {
rxData[0] = data;
// set LSB = Bit0
state |= Bit0;
}
auto result { transfer(rxData, rxData, 2) };
if (result.getState() > modm::rf::NestingError)
return {modm::rf::Running};
// transfer finished
state &= ~Bit0;
return { modm::rf::Stop, rxData[1] };
#endif
// LSB != Bit0 ?
if ( !(state & Bit0) )
{
SpiHal1::disableInterrupt(SpiBase::Interrupt::TxDmaEnable | SpiBase::Interrupt::RxDmaEnable);
// wait for previous transfer to finish
if (!SpiHal1::isTransmitRegisterEmpty())
return {modm::rf::Running};
// start transfer by copying data into register
SpiHal1::write(data);
// set LSB = Bit0
state |= Bit0;
}
if (!SpiHal1::isReceiveRegisterNotEmpty())
return {modm::rf::Running};
SpiHal1::read(data);
// transfer finished
state &= ~Bit0;
return {modm::rf::Stop, data};
}
static modm::ResumableResult<void>
transfer(uint8_t *tx, uint8_t *rx, std::size_t length)
{
// this is a manually implemented "fast resumable function"
// there is no context or nesting protection, since we don't need it.
// there are only two states encoded into 1 bit (Bit1 of state):
// 1. initialize index, and
// 2. wait for 1-byte transfer to finish.
length = std::min(length, std::min(sizeof(dmaRxBuffer), sizeof(dmaTxBuffer)));
// we are only interested in Bit1
switch(state & Bit1)
{
case 0:
// we will only visit this state once
state |= Bit1;
error = false;
SpiHal1::enableInterrupt(SpiBase::Interrupt::TxDmaEnable | SpiBase::Interrupt::RxDmaEnable);
if (tx)
std::copy(tx, tx + length, dmaTxBuffer);
else
std::fill(dmaTxBuffer, dmaTxBuffer + length, 0); // tx is null here, so send zeros from the DMA buffer
RxChannel::setDataLength(length);
receiveComplete = false;
RxChannel::enable();
TxChannel::setDataLength(length);
transmitComplete = false;
TxChannel::enable();
[[fallthrough]];
default:
while (true) {
if (error)
break;
if (not transmitComplete and not receiveComplete)
return { modm::rf::Running };
if (SpiHal1::getInterruptFlags() & SpiBase::InterruptFlag::FifoTxLevel)
return { modm::rf::Running };
if (SpiHal1::getInterruptFlags() & SpiBase::InterruptFlag::Busy)
return { modm::rf::Running };
if (SpiHal1::getInterruptFlags() & SpiBase::InterruptFlag::FifoRxLevel)
return { modm::rf::Running };
if (rx) {
std::copy(dmaRxBuffer, dmaRxBuffer + length, rx);
break;
}
break;
}
SpiHal1::disableInterrupt(SpiBase::Interrupt::TxDmaEnable | SpiBase::Interrupt::RxDmaEnable);
// clear the state
state &= ~Bit1;
return {modm::rf::Stop};
}
}
private:
static void
handleTransferError()
{
SpiHal1::disableInterrupt(SpiBase::Interrupt::TxDmaEnable | SpiBase::Interrupt::RxDmaEnable);
error = true;
}
static void
handleReceiveComplete()
{
RxChannel::disable();
receiveComplete = true;
}
static void
handleTransmitComplete()
{
TxChannel::disable();
transmitComplete = true;
}
static uint8_t dmaRxBuffer[64];
static uint8_t dmaTxBuffer[64];
static bool error;
static bool transmitComplete;
static bool receiveComplete;
friend DmaController;
};
template <class DmaController>
uint8_t
SpiMaster1_Dma<DmaController>::dmaRxBuffer[64];
template <class DmaController>
uint8_t
SpiMaster1_Dma<DmaController>::dmaTxBuffer[64];
template <class DmaController>
bool
SpiMaster1_Dma<DmaController>::error(false);
template <class DmaController>
bool
SpiMaster1_Dma<DmaController>::receiveComplete(false);
template <class DmaController>
bool
SpiMaster1_Dma<DmaController>::transmitComplete(false);
The SpiHal1 class got one additional method:
setRxFifoThreshold(RxFifoThreshold threshold)
{
SPI1->CR2 = (SPI1->CR2 & ~static_cast<uint32_t>(RxFifoThreshold::QuarterFull))
| static_cast<uint32_t>(threshold);
}
This is working now. It is different from your approach, but hides the DMA from the user inside the SPI class. The only thing required would be the DMA. Also, I think that although ST's naming differs, in the end they are talking about the same thing: on an L4 they call it channels and requests, on an F4 streams and channels. I'll be out for a few hours and continue later. I just wanted to share what I have so far.
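The register arithmetic in the DMA class posted above can be checked on the host with a few `constexpr` helpers. The offsets and shifts are copied from that code; the `Dma1Base` value and the 4-bit width of each request selection field are assumptions for the STM32L4 and should be verified against the reference manual. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed DMA1 base address on an STM32L4; check the device header.
constexpr uint32_t Dma1Base = 0x40020000;

// Address of a channel's register block: first block at offset 0x08,
// one block every 0x14 bytes (channelIndex 0 corresponds to Channel1).
constexpr uint32_t
channelBase(uint32_t channelIndex)
{
    return Dma1Base + 0x08 + channelIndex * 0x14;
}

// Position of a per-channel flag in ISR/IFCR: four flag bits per channel.
constexpr uint32_t
isrFlag(uint32_t flag, uint32_t channelIndex)
{
    return flag << (channelIndex * 4);
}

// New request selection register value with one channel's field replaced,
// assuming 4-bit fields.
constexpr uint32_t
selectRequest(uint32_t cselr, uint32_t channelIndex, uint32_t request)
{
    assert(request <= 0x0F);
    const uint32_t shift = channelIndex * 4;
    return (cselr & ~(uint32_t(0x0F) << shift)) | (request << shift);
}
```

For example, `channelBase(2)` yields the block for Channel3, which the posted `ChannelConfig` uses as the SPI1 Tx channel.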
I spent some more time separating the DMA HAL. Now I'm curious: which way should we go?
Your code looks good!
OK, I'll do it. Should I simply replace the existing DMA files, or create a v2 next to the current implementation?
I would replace the existing dma driver. We won't keep both in modm.
I can help you with that.
Closing this in favor of #371
This is a rebased version of @chris-durand's and my proposal for a DMA API from two years ago.
@mikewolfram Maybe this is a base you can start with. If you like, take the code and finish the DMA driver (or ignore this and write your own).
I will try to find time in the next days to write down our thoughts.