How Categories Work

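Recall what an Objective-C object is at its core: every object contains an isa pointer. Before arm64, isa was a plain pointer that held the address of the class object or metaclass object. Starting with the arm64 architecture, Apple optimized isa into a union that uses bit fields to store additional information. As a result, an object's isa does not point directly at the class object or metaclass object; the address has to be extracted with a bitwise AND against ISA_MASK. The extra step is needed because the remaining bits of isa carry that additional information.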

Structure of a Category

The definition of a category can be found in objc-runtime-new.h:

struct category_t {
    const char *name;
    classref_t cls;
    struct method_list_t *instanceMethods;
    struct method_list_t *classMethods;
    struct protocol_list_t *protocols;
    struct property_list_t *instanceProperties;
    // Fields below this point are not always present on disk.
    struct property_list_t *_classProperties;

    method_list_t *methodsForMeta(bool isMeta) {
        if (isMeta) return classMethods;
        else return instanceMethods;
    }

    property_list_t *propertiesForMeta(bool isMeta, struct header_info *hi);
};

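The source shows how everything we normally put in a category is stored: the category name, instance methods, class methods, protocols, and properties. The category struct has no field for instance variables, so a category is not allowed to add instance variables. A property declared in a category does not get an automatically generated instance variable; only the getter and setter declarations are generated, and we have to implement them ourselves.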

From the command line, we convert People+Test.m into the C++ file People+Test.cpp to see what the compiler generates for the category.

xcrun -sdk iphoneos clang -arch arm64 -rewrite-objc People+Test.m

_category_t

In People+Test.cpp we can see that the _category_t struct holds the class name, the instance method list, the class method list, the protocol list, and the property list.

struct _category_t {
    const char *name;
    struct _class_t *cls;
    const struct _method_list_t *instance_methods;
    const struct _method_list_t *class_methods;
    const struct _protocol_list_t *protocols;
    const struct _prop_list_t *properties;
};

Instance method list

We can also see a struct of type _method_list_t:

static struct /*_method_list_t*/ {
    unsigned int entsize;  // sizeof(struct _objc_method)
    unsigned int method_count;
    struct _objc_method method_list[3];
} _OBJC_$_CATEGORY_INSTANCE_METHODS_People_$_Test __attribute__ ((used, section ("__DATA,__objc_const"))) = {
    sizeof(_objc_method),
    3,
    {{(struct objc_selector *)"test", "v16@0:8", (void *)_I_People_Test_test},
    {(struct objc_selector *)"setAge:", "v20@0:8i16", (void *)_I_People_Test_setAge_},
    {(struct objc_selector *)"age", "i16@0:8", (void *)_I_People_Test_age}}
};

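The name of this struct, _OBJC_$_CATEGORY_INSTANCE_METHODS_People_$_Test, tells us that it holds INSTANCE_METHODS, that is, instance methods, and its initializer fills in the fields declared above one by one: the size of one method entry, the number of methods, and the method list itself. In the listing above we can find the three instance methods implemented in the category: test, setAge:, and age.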

Class method list

Next we find the class method struct, which is also of type _method_list_t:

static struct /*_method_list_t*/ {
    unsigned int entsize;  // sizeof(struct _objc_method)
    unsigned int method_count;
    struct _objc_method method_list[1];
} _OBJC_$_CATEGORY_CLASS_METHODS_People_$_Test __attribute__ ((used, section ("__DATA,__objc_const"))) = {
    sizeof(_objc_method),
    1,
    {{(struct objc_selector *)"classMethod", "v16@0:8", (void *)_C_People_Test_classMethod}}
};

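As with the instance method list, the name _OBJC_$_CATEGORY_CLASS_METHODS_People_$_Test tells us that this struct is the class method list. It has the same layout as the instance method struct, and it contains the class method we implemented, classMethod.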

Protocol method list


static const char *_OBJC_PROTOCOL_METHOD_TYPES_NSCopying [] __attribute__ ((used, section ("__DATA,__objc_const"))) =
{
    "@24@0:8^{_NSZone=}16"
};

static struct /*_method_list_t*/ {
    unsigned int entsize;  // sizeof(struct _objc_method)
    unsigned int method_count;
    struct _objc_method method_list[1];
} _OBJC_PROTOCOL_INSTANCE_METHODS_NSCopying __attribute__ ((used, section ("__DATA,__objc_const"))) = {
    sizeof(_objc_method),
    1,
    {{(struct objc_selector *)"copyWithZone:", "@24@0:8^{_NSZone=}16", 0}}
};

struct _protocol_t _OBJC_PROTOCOL_NSCopying __attribute__ ((used)) = {
    0,
    "NSCopying",
    0,
    (const struct method_list_t *)&_OBJC_PROTOCOL_INSTANCE_METHODS_NSCopying,
    0,
    0,
    0,
    0,
    sizeof(_protocol_t),
    0,
    (const char **)&_OBJC_PROTOCOL_METHOD_TYPES_NSCopying
};
struct _protocol_t *_OBJC_LABEL_PROTOCOL_$_NSCopying = &_OBJC_PROTOCOL_NSCopying;

static struct /*_protocol_list_t*/ {
    long protocol_count;  // Note, this is 32/64 bit
    struct _protocol_t *super_protocols[1];
} _OBJC_CATEGORY_PROTOCOLS_$_People_$_Test __attribute__ ((used, section ("__DATA,__objc_const"))) = {
    1,
    &_OBJC_PROTOCOL_NSCopying
};

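The source above first stores the protocol's methods in a _method_list_t struct, then references that list from a _protocol_t struct, and finally places the _protocol_t in _OBJC_CATEGORY_PROTOCOLS_$_People_$_Test. This last struct matches _protocol_list_t field by field: protocol_count is the number of protocols, followed by the _protocol_t structs that carry the protocol methods.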

Property list

static struct /*_prop_list_t*/ {
    unsigned int entsize;  // sizeof(struct _prop_t)
    unsigned int count_of_properties;
    struct _prop_t prop_list[1];
} _OBJC_$_PROP_LIST_People_$_Test __attribute__ ((used, section ("__DATA,__objc_const"))) = {
    sizeof(_prop_t),
    1,
    {{"age","Ti,N"}}
};

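The property list struct _OBJC_$_PROP_LIST_People_$_Test matches the _prop_list_t struct: it stores the size of one property entry, the number of properties, and the property list. In the listing above we can see the age property that we declared.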

_OBJC_$_CATEGORY_People_$_Test

Compare this with the definition of the _category_t struct:

struct _category_t {
    const char *name;
    struct _class_t *cls;
    const struct _method_list_t *instance_methods;
    const struct _method_list_t *class_methods;
    const struct _protocol_list_t *protocols;
    const struct _prop_list_t *properties;
};

extern "C" __declspec(dllimport) struct _class_t OBJC_CLASS_$_People;

static struct _category_t _OBJC_$_CATEGORY_People_$_Test __attribute__ ((used, section ("__DATA,__objc_const"))) =
{
    "People",
    0, // &OBJC_CLASS_$_People,
    (const struct _method_list_t *)&_OBJC_$_CATEGORY_INSTANCE_METHODS_People_$_Test,
    (const struct _method_list_t *)&_OBJC_$_CATEGORY_CLASS_METHODS_People_$_Test,
    (const struct _protocol_list_t *)&_OBJC_CATEGORY_PROTOCOLS_$_People_$_Test,
    (const struct _prop_list_t *)&_OBJC_$_PROP_LIST_People_$_Test,
};
static void OBJC_CATEGORY_SETUP_$_People_$_Test(void ) {
    _OBJC_$_CATEGORY_People_$_Test.cls = &OBJC_CLASS_$_People;
}

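The initializer matches the struct definition field by field. We also see the declaration of OBJC_CLASS_$_People, a struct of type _class_t, and at the end the cls pointer of _OBJC_$_CATEGORY_People_$_Test is set to the address of OBJC_CLASS_$_People. In other words, cls points to the class object of the category's host class.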

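What happens to categories at runtime

The analysis so far shows that the compiler places the instance methods, class methods, properties, and so on that we define in a category into a category_t struct. Next we go back to the runtime source to see how those methods and properties end up stored in the class object.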

/***********************************************************************
* _objc_init
* Bootstrap initialization. Registers our image notifier with dyld.
* Called by libSystem BEFORE library initialization time
**********************************************************************/

void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;

    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    lock_init();
    exception_init();

    _dyld_objc_notify_register(&map_images, load_images, unmap_image);
}

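_objc_init is the runtime's entry point. It registers map_images with dyld, and when images are mapped, map_images ends up calling _read_images (in this version of the source, by way of map_images_nolock). In _read_images we find the code that loads categories: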

// Discover categories.
for (EACH_HEADER) {
    category_t **catlist =
        _getObjc2CategoryList(hi, &count);
    bool hasClassProperties = hi->info()->hasCategoryClassProperties();

    for (i = 0; i < count; i++) {
        category_t *cat = catlist[i];
        Class cls = remapClass(cat->cls);

        if (!cls) {
            // Category's target class is missing (probably weak-linked).
            // Disavow any knowledge of this category.
            catlist[i] = nil;
            if (PrintConnecting) {
                _objc_inform("CLASS: IGNORING category \?\?\?(%s) %p with "
                                "missing weak-linked target class",
                                cat->name, cat);
            }
            continue;
        }

        // Process this category.
        // First, register the category with its target class.
        // Then, rebuild the class's method lists (etc) if
        // the class is realized.
        bool classExists = NO;
        if (cat->instanceMethods ||  cat->protocols  
            ||  cat->instanceProperties)
        {
            addUnattachedCategoryForClass(cat, cls, hi);
            if (cls->isRealized()) {
                remethodizeClass(cls);
                classExists = YES;
            }
            if (PrintConnecting) {
                _objc_inform("CLASS: found category -%s(%s) %s",
                                cls->nameForLogging(), cat->name,
                                classExists ? "on existing class" : "");
            }
        }

        if (cat->classMethods  ||  cat->protocols  
            ||  (hasClassProperties && cat->_classProperties))
        {
            addUnattachedCategoryForClass(cat, cls->ISA(), hi);
            if (cls->ISA()->isRealized()) {
                remethodizeClass(cls->ISA());
            }
            if (PrintConnecting) {
                _objc_inform("CLASS: found category +%s(%s)",
                                cls->nameForLogging(), cat->name);
            }
        }
    }
}

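This code discovers the categories in each image. After obtaining the category list through _getObjc2CategoryList, it iterates over the list and checks each category's methods, protocols, and properties. Both branches register the category with addUnattachedCategoryForClass and, if the target class (or metaclass) is already realized, call remethodizeClass(cls). Let's look inside remethodizeClass.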

/***********************************************************************
* remethodizeClass
* Attach outstanding categories to an existing class.
* Fixes up cls's method list, protocol list, and property list.
* Updates method caches for cls and its subclasses.
* Locking: runtimeLock must be held by the caller
**********************************************************************/
static void remethodizeClass(Class cls)
{
    category_list *cats;
    bool isMeta;

    runtimeLock.assertLocked();

    isMeta = cls->isMetaClass();

    // Re-methodizing: check for more categories
    if ((cats = unattachedCategoriesForClass(cls, false/*not realizing*/))) {
        if (PrintConnecting) {
            _objc_inform("CLASS: attaching categories to class '%s' %s",
                         cls->nameForLogging(), isMeta ? "(meta)" : "");
        }

        attachCategories(cls, cats, true /*flush caches*/);
        free(cats);
    }
}

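The code shows that attachCategories receives the class object cls and the category array cats. A class can have more than one category. We said earlier that the information for one category is stored in a category_t struct; multiple categories are kept together in a category_list.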

Now let's look at attachCategories:

// Attach method lists and properties and protocols from categories to a class.
// Assumes the categories in cats are all loaded and sorted by load order,
// oldest categories first.
static void
attachCategories(Class cls, category_list *cats, bool flush_caches)
{
    if (!cats) return;
    if (PrintReplacedMethods) printReplacements(cls, cats);

    bool isMeta = cls->isMetaClass();

    // fixme rearrange to remove these intermediate allocations
    method_list_t **mlists = (method_list_t **)
        malloc(cats->count * sizeof(*mlists)); // one slot per category for its method list
    property_list_t **proplists = (property_list_t **)
        malloc(cats->count * sizeof(*proplists)); // one slot per category for its property list
    protocol_list_t **protolists = (protocol_list_t **)
        malloc(cats->count * sizeof(*protolists)); // one slot per category for its protocol list

    // Count backwards through cats to get newest categories first
    int mcount = 0;
    int propcount = 0;
    int protocount = 0;
    int i = cats->count;
    bool fromBundle = NO;
    while (i--) {
        auto& entry = cats->list[i]; // visit each category, newest first

        method_list_t *mlist = entry.cat->methodsForMeta(isMeta);
        if (mlist) { // collect every category's method list in mlists
            mlists[mcount++] = mlist;
            fromBundle |= entry.hi->isBundle();
        }

        property_list_t *proplist =
            entry.cat->propertiesForMeta(isMeta, entry.hi);
        if (proplist) { // collect every category's property list in proplists
            proplists[propcount++] = proplist;
        }

        protocol_list_t *protolist = entry.cat->protocols;
        if (protolist) { // collect every category's protocol list in protolists
            protolists[protocount++] = protolist;
        }
    }

    // rw: the class_rw_t struct, which the class uses to store its methods, properties, and protocols
    auto rw = cls->data();

    prepareMethodLists(cls, mlists, mcount, NO, fromBundle);
    rw->methods.attachLists(mlists, mcount); // hand mlists to rw->methods.attachLists, then free the temporary array
    free(mlists);
    if (flush_caches  &&  mcount > 0) flushCaches(cls);

    rw->properties.attachLists(proplists, propcount); // hand proplists to rw->properties.attachLists, then free the temporary array
    free(proplists);

    rw->protocols.attachLists(protolists, protocount); // hand protolists to rw->protocols.attachLists, then free the temporary array
    free(protolists);
}

void attachLists(List* const * addedLists, uint32_t addedCount) {
        if (addedCount == 0) return;

        if (hasArray()) {
            // many lists -> many lists
            uint32_t oldCount = array()->count;
            uint32_t newCount = oldCount + addedCount;
            setArray((array_t *)realloc(array(), array_t::byteSize(newCount)));
            array()->count = newCount;
            // array()->lists : the existing array of lists
            // addedCount : the number of category lists being added
            memmove(array()->lists + addedCount, array()->lists,
                    oldCount * sizeof(array()->lists[0])); // memmove: shift the oldCount existing entries back by addedCount slots
            memcpy(array()->lists, addedLists,
                   addedCount * sizeof(array()->lists[0])); // memcpy: copy the addedCount entries of addedLists into the freed slots at the front
        }
        else if (!list  &&  addedCount == 1) {
            // 0 lists -> 1 list
            list = addedLists[0];
        }
        else {
            // 1 list -> many lists
            List* oldList = list;
            uint32_t oldCount = oldList ? 1 : 0;
            uint32_t newCount = oldCount + addedCount;
            setArray((array_t *)malloc(array_t::byteSize(newCount)));
            array()->count = newCount;
            if (oldList) array()->lists[addedCount] = oldList;
            memcpy(array()->lists, addedLists,
                   addedCount * sizeof(array()->lists[0]));
        }
    }

The two most important calls in attachLists are memmove, which moves memory, and memcpy, which copies memory.

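// memmove: moves memory; the source and destination may overlap.
/*  __dst : destination of the move
*   __src : start address of the memory to move
*   __len : number of bytes to move
*   Moves __len bytes from __src to __dst
*/
void *memmove(void *__dst, const void *__src, size_t __len);

// memcpy: copies memory; the source and destination must not overlap.
/*  __dst : destination of the copy
*   __src : start address of the memory to copy
*   __n : number of bytes to copy
*   Copies __n bytes from __src to __dst
*/
void *memcpy(void *__dst, const void *__src, size_t __n);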

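// array()->lists : the existing array of method, property, or protocol lists
// addedCount : the number of category lists being added
// oldCount * sizeof(array()->lists[0]) : the space occupied by the existing entries
memmove(array()->lists + addedCount, array()->lists,
                  oldCount * sizeof(array()->lists[0]));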

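After the memmove call, the class's own method, property, and protocol lists have each been shifted toward the back of the array, but the array pointer itself still points to the original start position.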

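// array()->lists : the existing array of method, property, or protocol lists
// addedLists : the array of category method, property, or protocol lists
// addedCount * sizeof(array()->lists[0]) : the space occupied by the category entries
memcpy(array()->lists, addedLists,
               addedCount * sizeof(array()->lists[0]));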

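The array pointer has not changed; from start to finish it points to the beginning of the array. After memmove and memcpy, the category's method, property, and protocol lists sit in front of the lists that the class object originally stored. The purpose is to make sure category methods are found first. We usually say that when a category reimplements a method of its class, it overrides the class's method. The analysis above shows that this is not really overriding: the category's method is simply found first, and the class's own method is still in memory. We can verify this by printing all the method names of a class: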

- (void)printMethodNamesOfClass:(Class)cls {
    unsigned int count;
    // Get the array of methods
    Method *methodList = class_copyMethodList(cls, &count);
    // Accumulates the method names
    NSMutableString *methodNames = [NSMutableString string];
    // Iterate over all methods
    for (unsigned int i = 0; i < count; i++) {
        // Get the method
        Method method = methodList[i];
        // Get the method name
        NSString *methodName = NSStringFromSelector(method_getName(method));
        // Append the method name
        [methodNames appendString:methodName];
        [methodNames appendString:@", "];
    }
    // Free the copied array
    free(methodList);
    // Print the method names
    NSLog(@"%@ - %@", cls, methodNames);
}

- (void)viewDidLoad {
    [super viewDidLoad];
    People *p = [[People alloc] init];
    [p run];
    [self printMethodNamesOfClass:[People class]];
}

When both People and its category implement run, the output of this code lists the run method twice.

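Summary

A category is implemented by placing its methods, properties, and protocols in a category_t struct; at runtime the lists in that struct are attached to the front of the class object's own lists. A category can declare properties, but no instance variable and no setter/getter implementations are generated for them, because the category_t struct has no place for instance variables. From the earlier analysis of objects we know that instance variables are stored in the instance and that the instance layout is fixed at compile time, whereas categories are only loaded at runtime. There is therefore no way to add a category's instance variables to the instance layout while the program is running, which is why categories cannot add instance variables.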