Lexer的设计--中(4)

设计一个小型的内存池以及链表

上一节撸到万事俱备只欠真正的lex, 但是lex的作用是将源代码转化为Token流, 用什么保存Token? 这就涉及到我们要接触的第一个数据结构—链表, 虽然标准库中很多容器都可以承担链表的任务, 但是我说过出于锻炼原因, 我会尽量不使用stl中的容器, 所以我决定自己撸一个链表出来, 既然之后大多数的容器都要自己撸, 干脆连内存池也一并撸一个出来, 所以这一节的两个目的就是 : 内存池以及基于这个内存池的链表.

我个人接触内存池的时间不长, 准确的说我目前脑袋里面只有两中内存池的思路, 一种是sgi stl内存池,我之前有一篇文章专门对这种内存池做过讲解, 但是这里我会使用另外一种较为简易的内存池模型, 因为我认为sgi stl准确来说可以支持频繁地不同大小的内存分配, 而我这里会使用一种简化的思路, 使之每一个内存池只支持单一大小的内存分配. 这种内存池也是从别人那里学过来的, 简图差不多是这样.

然后源代码差不多是这样...

#ifndef FRED_MEMORYPOOL_H
#define FRED_MEMORYPOOL_H

#include <cstdio>
#include <cstdlib>
#include <vector>

template <typename T, size_t NumberForOneNode = 32>
class MemoryPool{
private:
    struct node{
        void* space;
        node* next;
    };
    node *head, *tail;
    size_t left;
    void* cur;

protected:
    MemoryPool():left(NumberForOneNode){
        tail = head = new node;
        head->next = 0;
        cur = head->space = static_cast<T*>(malloc(sizeof(T) * NumberForOneNode));
    }

    //Big three
    MemoryPool(const MemoryPool&) = delete;
    MemoryPool& operator=(const MemoryPool& rhs) = delete;
    ~MemoryPool();

    void* allocate();
};

template <typename T, size_t NumberForOneNode>
MemoryPool<T, NumberForOneNode>::~MemoryPool() {
    while(true) {
        if (head == tail) {
            free(head->space);
            delete head;
            return;
        }
        auto temp = head;
        head = head->next;
        free(temp->space);
        delete temp;
    }
}

template <typename T, size_t NumberForOneNode>
void* MemoryPool<T, NumberForOneNode>::allocate() {
    if(left--){
        auto re = cur;
        cur = reinterpret_cast<char*>(cur) +sizeof(T);
        return re;
    }

    left = NumberForOneNode;
    auto newNode = new node;
    newNode->next = 0;
    tail = tail->next = newNode;
    cur = newNode->space = static_cast<T*>(malloc(sizeof(T) * NumberForOneNode));
    allocate();
}

#endif //FRED_MEMORYPOOL_H

图上的变量和代码里面的变量名字都是统一的, 很好理解...

这个内存池的最后一步, 是在这个内存池的基础上, 再套一层封装.

#ifndef FRED_ALLOCATOR_H
#define FRED_ALLOCATOR_H


#include "MemoryPool.h"

template <typename T, size_t NumberForOneNode = 32>
class Allocator : private MemoryPool<T, NumberForOneNode> {
private:
    void* buffer[NumberForOneNode];
    size_t left;

public:
    Allocator():left(0){};

    void* allocator(){
        if(left){
            return buffer[--left];
        }
        else{
            return MemoryPool<T, NumberForOneNode>::allocate();
        }
    }

    void deallocator(T* ptr){
       ptr->~T();
       if(left == NumberForOneNode){
            //full
            return;
       }       
        buffer[left++] = ptr;
    }
};

#endif //FRED_ALLOCATOR_H

思想其实很简单, 就是如果有空间被送回, 并不直接交还给系统, 而是用这个叫做buffer的数组存着, 如果之后再有需要, 优先从数组中取, 其实这里用vector要比buffer更好, 但是如果想要重利用的空间规模不大的话, buffer也够用, 就算这个buffer也不会内存泄漏, 只是有一些空间被浪费了, 等到维护这个allocator的类卒了, 空间还是要被释放的.

如果想看这个内存池的原版本实现, 可以看这里

原版本实现

有了内存池, 根据我们的需求我们只需要一个链表.

#ifndef FRED_LINKLIST_H
#define FRED_LINKLIST_H

#include "MemoryPool/Allocator.h"

template <typename T>
class LinkList{
private:
    struct node{
        T item;
        struct node* next;
    };
    Allocator<node> allocator;
    node* head;
    node* cur;
    size_t size;

public:
    LinkList():head(0), cur(0), size(0){}
    LinkList(const LinkList&) = delete;
    LinkList& operator=(const LinkList&) = delete;
    ~LinkList(){
        for(auto temp = head; temp != cur; temp = temp->next){
            allocator.deallocator(temp);
        }
        allocator.deallocator(cur);
    }

    void pushBack(const T& item){
        if(cur) {
            cur = cur->next = reinterpret_cast<node*>(new(allocator.allocator())T(item));
        }else {
            cur = head = reinterpret_cast<node*>(new(allocator.allocator())T(item));
        }
        ++size;
        cur->next = 0;
    }

    node* getHead() const {
        return head;
    }
};

#endif //FRED_LINKLIST_H

对于这个链表功能和实现都很简单, 这里唯一要说可能有些人不知道new可以指定空间进行初始化, 不知道的可以去网上看一下, 这是placement new而我们一般使用的带有内存分配的叫做plain new...

大概就是这么多, 这个礼拜在成都玩, 可能更新地比较慢...