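On the Real-Time Performance of Embedded Real-Time Operating Systems

What metric actually defines "real time" in an embedded real-time operating system (RTOS)? A response time of 1 s is clearly too slow, but how fast is fast enough: 100 ms, 10 ms, 1 ms, or even 100 µs or 10 µs?

And how are these metrics measured?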

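An example: period accuracy requirements for 1553B bus messages

A paper gives timing requirements for the periods of 1553B bus messages. The example shows that a non-real-time operating system such as Windows struggles to guarantee even 10 ms accuracy. For this application, therefore, real-time performance means holding task periods to within 10 ms at the very least, and in some cases much tighter.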

韩春慧,王煜,黄书华,许权,张珅,鲁月林. 基于BM3803的1553B总线通信软件设计 [J]. 中国空间科学技术, 2019,39(234), 05 65-72.

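The 1553B bus test terminal described in the paper has to meet fairly strict limits on message period deviation:

For task 1 (broadcast time code), the period is 1 s and the period deviation must not exceed 100 µs.

For task 7 (system synchronization), the period is 2 s and the period deviation must not exceed 10,000 µs = 10 ms, as shown in the figure below.

The traditional Windows + 1553B PCI card approach cannot guarantee this accuracy, so the paper uses an embedded RTOS approach instead, BM3803 + uCOS + 61580, which can meet the accuracy shown in the figure above.

The table below gives the measured results. For task 1, the average period deviation is 8 µs with the embedded RTOS approach but as high as 13 ms with the Windows approach, which exceeds the 100 µs = 0.1 ms requirement.

The other tasks all require period accuracy within 10 ms. The embedded RTOS approach achieves an average of 1.5 ms, whereas Windows averages 15 ms, which exceeds the requirement.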
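Period deviation figures like the ones above can be computed from a list of message timestamps. The following is a minimal sketch in C, assuming the timestamps are already available in microseconds (for example from a bus monitor log). The names `period_stats_t`, `period_deviation`, and `period_within_tolerance` are illustrative and do not come from the paper, and the paper may define its statistics differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Result of comparing measured periods against a nominal period. */
typedef struct {
    uint64_t mean_abs_dev_us; /* mean |measured - nominal|, rounded down */
    uint64_t max_abs_dev_us;  /* worst single deviation */
} period_stats_t;

/* ts_us: n message timestamps in microseconds, in increasing order.
   Returns all-zero stats when fewer than two timestamps are given. */
period_stats_t period_deviation(const uint64_t *ts_us, size_t n,
                                uint64_t nominal_us)
{
    period_stats_t s = {0, 0};
    uint64_t sum = 0;

    assert(ts_us != NULL || n == 0);
    if (n < 2) {
        return s;
    }
    for (size_t i = 1; i < n; i++) {
        uint64_t period = ts_us[i] - ts_us[i - 1];
        uint64_t dev = (period > nominal_us) ? period - nominal_us
                                             : nominal_us - period;
        sum += dev;
        if (dev > s.max_abs_dev_us) {
            s.max_abs_dev_us = dev;
        }
    }
    s.mean_abs_dev_us = sum / (n - 1);
    return s;
}

/* Returns 1 when the worst single deviation is within the tolerance. */
int period_within_tolerance(period_stats_t s, uint64_t tol_us)
{
    return s.max_abs_dev_us <= tol_us;
}
```

For task 1, `nominal_us` would be 1000000 and `tol_us` 100; for task 7, 2000000 and 10000. The sketch checks the worst-case deviation against the tolerance, which is stricter than comparing the average.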

Paper

Link: https://pan.baidu.com/s/15P6VCZqdieAlSH9Mq8anmg
Extraction code: o1vq

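What real-time metrics are there?

Express Logic has a document, Measuring RTOS Real-Time Performance, that describes the various real-time metrics and closes by introducing the company's RTOS real-time measurement software.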

https://rtos.com/wp-content/uploads/2017/10/EL_Measuring_RTOS_Real-Time_Performance.pdf

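The document covers two main areas.

The first is interrupt handling, which involves the following steps:

(1) interrupt the currently executing task,

(2) save the context of the current task,

(3) begin executing the interrupt service routine (ISR),

(4) do some processing in the ISR to determine what action is needed,

(5) save any critical data associated with the interrupt,

(6) set any required outputs,

(7) determine which task should run (an interrupt usually requires a fair amount of processing; the ISR typically does only what is essential and leaves the rest to a task),

(8) clear the interrupt status register,

(9) transfer control to the task that is to run.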
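Steps (3) to (9) are often implemented as a short ISR that captures data and hands the heavy processing to a task. The following is a host-runnable sketch of that split, assuming a single-producer, single-consumer ring buffer. `demo_rx_isr`, `demo_rx_task`, `RX_QUEUE_LEN`, and the `data_reg` parameter are hypothetical names for illustration; a real ISR would read a hardware register and use the RTOS's own wake-up service instead of a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RX_QUEUE_LEN 8u  /* hypothetical depth; must be a power of two */

static volatile uint8_t  rx_queue[RX_QUEUE_LEN];
static volatile uint32_t rx_head;  /* written only by the ISR  */
static volatile uint32_t rx_tail;  /* written only by the task */
static volatile bool     task_wakeup_pending;

/* ISR level: do the minimum. `data_reg` stands in for a value read from a
   hardware register. Returns false if the queue is full. */
bool demo_rx_isr(uint8_t data_reg)
{
    if (rx_head - rx_tail == RX_QUEUE_LEN) {
        return false;                             /* overrun: drop the sample */
    }
    rx_queue[rx_head % RX_QUEUE_LEN] = data_reg;  /* step (5): save key data  */
    rx_head++;
    task_wakeup_pending = true;                   /* step (7): pick the task  */
    return true;                                  /* steps (8), (9) on exit   */
}

/* Task level: the lengthy processing runs here with interrupts enabled.
   Returns the sum of the drained bytes as a stand-in for real work. */
uint32_t demo_rx_task(void)
{
    uint32_t sum = 0;

    assert(rx_head - rx_tail <= RX_QUEUE_LEN);
    task_wakeup_pending = false;
    while (rx_tail != rx_head) {
        sum += rx_queue[rx_tail % RX_QUEUE_LEN];
        rx_tail++;
    }
    return sum;
}
```

Keeping the ISR to a bounded copy plus a flag is what makes its contribution to interrupt latency easy to measure.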

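The second is system services, including:

(1) scheduling a task to run when some event occurs,

(2) passing messages between tasks (message queues),

(3) claiming shared resources (semaphores and the like).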

TNKernel-RX is one RTOS that uses Thread-Metric (TNKernel-RX/Thread-Metric/).

Source code: https://github.com/msalau/TNKernel-RX/tree/master/Thread-Metric

PDF link: https://pan.baidu.com/s/1pJH2azMJb8QNmYZwXUQpFA
Extraction code: t421

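What the user program has to do

Everything above concerns the RTOS itself. What does the user program need to do?

An RTOS kernel generally contains critical sections in which interrupts must be disabled. While interrupts are masked, the response to external interrupts is delayed, which degrades the system's real-time performance.

The same reasoning applies to user code: as long as the user program never keeps interrupts disabled for longer than the kernel does, it will not make the worst-case interrupt latency any worse than the kernel already makes it.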
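One common way to keep user-level interrupt-off time short is to copy the shared data inside the critical section and process the copy afterwards. The following is a host-runnable sketch of that pattern. `demo_irq_disable` and `demo_irq_enable` are hypothetical stand-ins for a port's real mask and unmask primitives, and `sensor_frame_t` is an invented data type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for interrupt mask/unmask primitives. On a host
   build they only track nesting so that the sketch can run. */
static int irq_off_depth;
static void demo_irq_disable(void) { irq_off_depth++; }
static void demo_irq_enable(void)  { assert(irq_off_depth > 0); irq_off_depth--; }

typedef struct {
    uint32_t timestamp_us;
    int16_t  samples[4];
} sensor_frame_t;

static sensor_frame_t shared_frame;  /* updated by an ISR in a real system */

/* Slow work that must never run with interrupts masked. */
static int32_t demo_process(const sensor_frame_t *f)
{
    int32_t sum = 0;

    assert(irq_off_depth == 0);
    for (size_t i = 0; i < 4; i++) {
        sum += f->samples[i];
    }
    return sum;
}

int32_t demo_read_and_process(void)
{
    sensor_frame_t local;

    demo_irq_disable();                           /* interrupts masked ...     */
    memcpy(&local, &shared_frame, sizeof local);
    demo_irq_enable();                            /* ... for one small copy    */

    return demo_process(&local);                  /* long work, interrupts on  */
}
```

The interrupt-off window is then a fixed-size copy whose duration can be bounded and compared against the kernel's own worst case.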

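A relevant reference is the document Interrupts-II: Bringing Organization to our Code (the shared-data problem), based on An Embedded Software Primer by David E. Simon, which describes how the time that kernel critical sections keep interrupts disabled affects the system's real-time performance.

Link: https://pan.baidu.com/s/1M3f1rdl2DHsMMpLJ1Ptb0Q
Extraction code: dbno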

For FreeRTOS: how to choose a way of protecting a critical section

https://forums.freertos.org/t/how-to-choose-a-critical-section-protecton-approach-taskenter-critical-or-vtasksuspendall-or-mutex/9500

How to choose a critical section protection approach? taskENTER_CRITICAL or vTaskSuspendAll or mutex?

yanhc519 asked:

The first question is how to choose between taskENTER_CRITICAL and vTaskSuspendAll.
I have read the manual and found that taskENTER_CRITICAL disables interrupts, while vTaskSuspendAll does not disable interrupts and only suspends the scheduler. However, I still cannot understand the rationale behind the API xQueueGenericSend. In xQueueGenericSend, the code between line 761 and line 897 (code1) uses taskENTER_CRITICAL to protect itself, while the code between line 902 and line 941 (code2) uses vTaskSuspendAll to protect itself. What is the rationale behind the protection approach chosen for code1 and for code2?

The second question is how to choose between taskENTER_CRITICAL and a mutex.
In this post, it said, "You should not be using taskENTER_CRITICAL to protect something that can take 6ms, let alone 80ms. A mutex is a much better option here, as that will only block the other tasks that try to use the file system." My platform is an ARM Cortex-M4 processor running at 168 MHz with a time slice of 1 ms. So a critical section that lasts longer than 6 time slices is unacceptable. I think a reasonable critical section should last less than 1/n of a time slice. So the rationale for choosing between taskENTER_CRITICAL and a mutex may be:

If the critical section lasts shorter than 1/n * time slice, then taskENTER_CRITICAL can be used.
If the critical section lasts longer than 1/n * time slice, then a mutex should be used.
Is my understanding correct? If so, which n should be chosen in practice?

Thanks in advance!


hs2 (Hartmut Schaefer) replied:

First, when referring to line numbers of code you need to give the actual (FreeRTOS) version of that code. But I can’t say much about the implementation of xQueueGenericSend anyway.
The concept of critical sections was recently very well explained by Richard Damon here in the forum.
When using critical sections you must be aware that they affect your (worst-case) interrupt response time. That means an ISR of a hardware interrupt might get delayed because all interrupts are disabled while executing code inside a critical section.
That’s the benefit of just suspending all tasks. Interrupts remain enabled and get handled by their corresponding ISRs.
There is no common rule for how long ‘very short’ is. It depends on the (real-time) requirements of your application.
Both protection methods are very lightweight and fast, but affect the whole application, i.e. all tasks regardless of their priorities.
In general you should narrow down the scope of any protection as much as possible. So when protecting access to some data used by only 3 tasks out of 5 in your application, it’s almost always better to use a mutex shared by these 3 tasks, so that all other tasks are not affected.
Also, mutexes take the task priorities into account to decide which of the possibly blocked other tasks gets the mutex next when it is released by the task currently holding it.

SOLUTION

richard-damon replied:

The basic characteristics that are used to decide between critical sections, suspending the scheduler and using a mutex are that critical sections are fast and cheap to perform, but have a wide impact as they disable interrupts. Critical sections affect interrupt latency, so I believe that one rule that FreeRTOS uses internally is that a critical section needs to have a strictly bounded execution time, and that time should be fairly short. In user code, since you know more about the requirements you may be able to relax some of these, but that is a good baseline. I personally limit critical sections to things that can be measured in a small number of microseconds at most.

The distinction between stopping the scheduler and a mutex is perhaps a bit more nuanced. Suspending the scheduler has wider impact, as it affects ALL tasks, and inside the section you can’t do anything that might block. It requires no ‘setup’ to create the locking agent like a mutex would. It doesn’t protect against ISR access (which is why when you suspend the scheduler it sets a flag to tell the ISRs not to modify the main task lists). The disadvantage of suspending the scheduler is that it takes some time (vs just a couple of instructions for a critical section). FreeRTOS uses suspending the scheduler if the time period isn’t both strictly bounded and short.

A mutex, on the other hand, only interacts with other tasks that use the same mutex, so it has the least impact on the full system. I don’t think FreeRTOS uses mutexes internally in the core, in part to avoid making them mandatory in a minimum configuration.


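(1) taskENTER_CRITICAL: disables interrupts, supports nesting, and is simple to use.

(2) vTaskSuspendAll: leaves interrupts enabled but suspends the scheduler; more involved to use.

(3) mutex: disables neither interrupts nor the scheduler.

For the FreeRTOS kernel, (3) is not an option: mutexes are an optional configuration item, so the kernel cannot rely on them.

The kernel therefore has to choose between (1) and (2).

Between (1) and (2), according to richard-damon's reply, (1) requires that the critical section have a strictly bounded and fairly short execution time; he personally limits it to a few microseconds ("strictly bounded execution time", "small number of microseconds at most").

If the execution time cannot be guaranteed to be both short and bounded, (2) has to be used ("FreeRTOS uses suspending the scheduler if the time period isn’t both strictly bounded and short."). When using (2), the protected section must not block ("inside the section you can’t do anything that might block").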
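The rules of thumb from the two replies can be written down as a small decision function. This is only a sketch of one possible reading of the thread, not an official FreeRTOS guideline. The type names, the `choose_protection` function, and the 5 µs threshold `DEMO_SHORT_LIMIT_US` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    PROTECT_CRITICAL_SECTION,   /* taskENTER_CRITICAL / taskEXIT_CRITICAL */
    PROTECT_SUSPEND_SCHEDULER,  /* vTaskSuspendAll / xTaskResumeAll       */
    PROTECT_MUTEX
} protect_t;

typedef struct {
    bool     shared_with_isr;   /* an ISR also touches the data           */
    bool     time_bounded;      /* worst-case duration is known           */
    uint32_t worst_case_us;     /* meaningful only when time_bounded      */
    bool     mutex_available;   /* false inside the kernel itself         */
} region_t;

#define DEMO_SHORT_LIMIT_US 5u  /* assumed meaning of "a few microseconds" */

protect_t choose_protection(const region_t *r)
{
    assert(r != NULL);

    bool short_and_bounded =
        r->time_bounded && r->worst_case_us <= DEMO_SHORT_LIMIT_US;

    /* Of the three, only masking interrupts keeps ISRs out, so data shared
       with an ISR needs a critical section (and must be kept short). */
    if (r->shared_with_isr || short_and_bounded) {
        return PROTECT_CRITICAL_SECTION;
    }
    /* A mutex affects only the tasks that use it. */
    if (r->mutex_available) {
        return PROTECT_MUTEX;
    }
    /* Fallback without mutexes; the protected code must not block. */
    return PROTECT_SUSPEND_SCHEDULER;
}
```

With these assumptions, a 2 µs flag update maps to a critical section, a 6 ms file system call in user code maps to a mutex, and a long list traversal inside a mutex-free kernel maps to suspending the scheduler.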

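Checking this statement further: "when you suspend the scheduler it sets a flag to tell the ISRs not to modify the main task lists"

vTaskSuspendAll, shown below, increments uxSchedulerSuspended. (The comment here is interesting: because the variable is of a basic type, no critical section is needed. Why? Because a basic type can be written with a single move rather than two? But ++ still takes three instructions, a load, an add and a store, so could it not be interrupted as well?)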

void vTaskSuspendAll( void )
{
    /* A critical section is not required as the variable is of type
    portBASE_TYPE. */
    ++uxSchedulerSuspended;
}
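One commonly given explanation for why the unprotected increment is safe: a task can be preempted between the load and the store only while the counter is zero, any other task that suspends the scheduler cannot be switched out until it has resumed it again, and ISRs only read the counter. The interrupted task therefore finds the same value it loaded when it runs again. The following host-runnable sketch models that argument. The `demo_` functions and `sched_suspended` are illustrative names, and the model assumes every suspend is paired with a resume.

```c
#include <assert.h>
#include <stdbool.h>

static unsigned sched_suspended;   /* models uxSchedulerSuspended */

/* A context switch is only permitted while the counter is zero. */
static bool demo_switch_allowed(void) { return sched_suspended == 0u; }

/* Models another task running until its next switch point: every suspend
   is paired with a resume before the task can be switched out again. */
static void demo_other_task_runs(unsigned nested_pairs)
{
    for (unsigned i = 0; i < nested_pairs; i++) { sched_suspended++; }
    for (unsigned i = 0; i < nested_pairs; i++) { sched_suspended--; }
    assert(demo_switch_allowed());
}

/* The increment split into load / add / store, with a preemption injected
   between the load and the store. Returns the final counter value. */
unsigned demo_suspend_with_preemption(unsigned other_task_pairs)
{
    unsigned reg = sched_suspended;          /* load                         */
    if (demo_switch_allowed()) {             /* preemption is possible only  */
        demo_other_task_runs(other_task_pairs); /* while the counter is zero */
    }
    reg = reg + 1u;                          /* add                          */
    sched_suspended = reg;                   /* store                        */
    return sched_suspended;
}
```

In the nested case the counter is already nonzero, so no task switch can occur in the middle of the increment at all.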

If uxSchedulerSuspended != 0, vTaskSwitchContext skips the context switch and records the missed yield in xMissedYield instead.

void vTaskSwitchContext( void )
{
    if( uxSchedulerSuspended != ( unsigned portBASE_TYPE ) pdFALSE )
    {
        /* The scheduler is currently suspended - do not allow a context
        switch. */
        xMissedYield = pdTRUE;
    }
    else

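Note also that xTaskResumeFromISR uses this scheduler-suspended variable: if the scheduler is suspended, the ready task lists must not be touched (why?), so the task is placed on the pending ready list instead.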

portBASE_TYPE xTaskResumeFromISR( xTaskHandle pxTaskToResume )
{
portBASE_TYPE xYieldRequired = pdFALSE;
tskTCB *pxTCB;
unsigned portBASE_TYPE uxSavedInterruptStatus;

    configASSERT( pxTaskToResume );

    pxTCB = ( tskTCB * ) pxTaskToResume;

    uxSavedInterruptStatus = portSET_INTERRUPT_MASK_FROM_ISR();
    {
        if( xTaskIsTaskSuspended( pxTCB ) == pdTRUE )
        {
            traceTASK_RESUME_FROM_ISR( pxTCB );

            if( uxSchedulerSuspended == ( unsigned portBASE_TYPE ) pdFALSE )
            {
                xYieldRequired = ( pxTCB->uxPriority >= pxCurrentTCB->uxPriority );
                uxListRemove( &( pxTCB->xGenericListItem ) );
                prvAddTaskToReadyQueue( pxTCB );
            }
            else
            {
                /* We cannot access the delayed or ready lists, so will hold this
                task pending until the scheduler is resumed, at which point a
                yield will be performed if necessary. */
                vListInsertEnd( ( xList * ) &( xPendingReadyList ), &( pxTCB->xEventListItem ) );
            }
        }
    }
    portCLEAR_INTERRUPT_MASK_FROM_ISR( uxSavedInterruptStatus );

    return xYieldRequired;
}
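A likely answer to the "why?" above: while the scheduler is suspended, interrupts stay enabled, so the task that suspended it may be halfway through reading or modifying the ready and delayed lists when an ISR fires. Parking the task on a separate pending list avoids touching a list that may be in an inconsistent state. The following host-runnable sketch models the mechanism with plain arrays. All `demo_` names and `DEMO_MAX_TASKS` are illustrative, and real FreeRTOS uses linked lists and additional interrupt masking.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_TASKS 4u

static unsigned demo_suspended;          /* scheduler suspend depth         */
static int      ready[DEMO_MAX_TASKS];   /* models the ready lists          */
static size_t   ready_n;
static int      pending[DEMO_MAX_TASKS]; /* models xPendingReadyList        */
static size_t   pending_n;

void demo_suspend_all(void) { demo_suspended++; }

/* ISR side: touch the ready list only when no task can be in the middle
   of using it; otherwise park the task on the pending list. */
void demo_resume_from_isr(int task_id)
{
    if (demo_suspended == 0u) {
        assert(ready_n < DEMO_MAX_TASKS);
        ready[ready_n++] = task_id;
    } else {
        assert(pending_n < DEMO_MAX_TASKS);
        pending[pending_n++] = task_id;
    }
}

/* Task side: once the depth returns to zero, move the parked tasks over. */
void demo_resume_all(void)
{
    assert(demo_suspended > 0u);
    demo_suspended--;
    if (demo_suspended == 0u) {
        for (size_t i = 0; i < pending_n; i++) {
            assert(ready_n < DEMO_MAX_TASKS);
            ready[ready_n++] = pending[i];
        }
        pending_n = 0;
    }
}
```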

Likewise, vTaskIncrementTick checks whether the scheduler is suspended. If it is, the missed ticks are counted and then made up when the scheduler resumes.

void vTaskIncrementTick( void )
{
tskTCB * pxTCB;

    /* Called by the portable layer each time a tick interrupt occurs.
    Increments the tick then checks to see if the new tick value will cause any
    tasks to be unblocked. */
    traceTASK_INCREMENT_TICK( xTickCount );
    if( uxSchedulerSuspended == ( unsigned portBASE_TYPE ) pdFALSE )
    {
        ++xTickCount;

    }
    else
    {
        ++uxMissedTicks;

        /* The tick hook gets called at regular intervals, even if the
        scheduler is locked. */
        #if ( configUSE_TICK_HOOK == 1 )
        {
            vApplicationTickHook();
        }
        #endif
    }
}

xTaskResumeAll then makes up the missed ticks.

signed portBASE_TYPE xTaskResumeAll( void )
{

    /* It is possible that an ISR caused a task [note: only an ISR can cause
    this] to be removed from an event
    list while the scheduler was suspended.  If this was the case then the
    removed task will have been added to the xPendingReadyList.  Once the
    scheduler has been resumed it is safe [?] to move all the pending ready
    tasks from this list into their appropriate ready list. */
    taskENTER_CRITICAL();
    {
        --uxSchedulerSuspended;

        if( uxSchedulerSuspended == ( unsigned portBASE_TYPE ) pdFALSE )
        {
            if( uxCurrentNumberOfTasks > ( unsigned portBASE_TYPE ) 0U )
            {
                portBASE_TYPE xYieldRequired = pdFALSE;

                /* If any ticks occurred while the scheduler was suspended then
                they should be processed now.  This ensures the tick count does not
                slip, and that any delayed tasks are resumed at the correct time. */
                if( uxMissedTicks > ( unsigned portBASE_TYPE ) 0U )
                {
                    while( uxMissedTicks > ( unsigned portBASE_TYPE ) 0U )
                    {
                        vTaskIncrementTick();
                        --uxMissedTicks;
                    }
                }
            }
        }
    }
    taskEXIT_CRITICAL();

    return xAlreadyYielded;
}
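The missed-tick bookkeeping in the two excerpts above can be modeled in a few lines. The following is a host-runnable sketch, not the FreeRTOS implementation: the variable names mirror the kernel's counters only loosely, the `demo_` functions are illustrative, and unblocking of delayed tasks is left out.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t tick_count;     /* models xTickCount            */
static uint32_t missed_ticks;   /* models uxMissedTicks         */
static uint32_t suspend_depth;  /* models uxSchedulerSuspended  */

/* Tick interrupt: advance time, or just count the tick while suspended. */
void demo_tick_isr(void)
{
    if (suspend_depth == 0u) {
        tick_count++;
    } else {
        missed_ticks++;
    }
}

void demo_tick_suspend(void) { suspend_depth++; }

/* On the outermost resume, replay every tick that was missed. */
void demo_tick_resume(void)
{
    assert(suspend_depth > 0u);
    suspend_depth--;
    if (suspend_depth == 0u) {
        while (missed_ticks > 0u) {
            demo_tick_isr();   /* depth is zero, so this advances tick_count */
            missed_ticks--;
        }
    }
}
```

Replaying the ticks one at a time, rather than adding the count in one step, matches the loop in the excerpt and lets each replayed tick run the normal per-tick processing.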