老青菜

A Low-Level Analysis of iOS App Launch

2017-08-08

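When an app launches, the system first loads the app's executable, reads the path to dyld recorded in it (the LC_LOAD_DYLINKER load command), and loads dyld. From that point on, dyld takes over. So what is dyld?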

dyld

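Short for "the dynamic link editor", i.e. the dynamic linker. Its source code is published by Apple (see the dyld link in the reference list at the end).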

Image

An image is a binary that the program loads: an executable, a framework, a dylib, or a bundle. An app is made up of many images, for example Foundation and CoreServices.

ImageLoader

The component inside dyld that loads images.

Mach-O

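Short for Mach Object, the executable file format on macOS and iOS, comparable to .exe files on Windows. The three types that matter here are:

  1. executable: the program's main binary.
  2. dylib: a dynamic library, comparable to a DLL on Windows or a .so on Linux.
  3. bundle: a dylib that cannot be linked against and can only be loaded with dlopen, e.g. a plug-in.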

Quoting WWDC 2016 Session 406:

File Types:
Executable—Main binary for application
Dylib—Dynamic library (aka DSO or DLL)
Bundle—Dylib that cannot be linked, only dlopen(), e.g. plug-ins
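
As a rough illustration of how these three types are told apart, here is a minimal sketch that maps the filetype field of a Mach-O header to the names used in the talk. The numeric values are the ones defined for MH_EXECUTE, MH_DYLIB and MH_BUNDLE in <mach-o/loader.h>; the SKETCH_ prefixed names and the macho_kind helper are made up for this example so that it compiles on any platform.

```c
#include <stdint.h>

/* Values of the `filetype` field in a Mach-O header, as defined in
 * <mach-o/loader.h>; redefined here so the sketch compiles anywhere. */
enum {
    SKETCH_MH_EXECUTE = 0x2,  /* main executable */
    SKETCH_MH_DYLIB   = 0x6,  /* dynamic library */
    SKETCH_MH_BUNDLE  = 0x8   /* bundle, loadable only through dlopen() */
};

/* Hypothetical helper: map a filetype value to the name used in the talk. */
static const char *macho_kind(uint32_t filetype)
{
    switch (filetype) {
    case SKETCH_MH_EXECUTE: return "Executable";
    case SKETCH_MH_DYLIB:   return "Dylib";
    case SKETCH_MH_BUNDLE:  return "Bundle";
    default:                return "Other";
    }
}
```

On a real system the filetype value would come from the mach_header of the file being inspected, for example macho_kind(header->filetype).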

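dyld Setup

Before iOS 13, dyld 2 was in use. Back to app launch: once the kernel has prepared the process, it hands the rest of the work to dyld. Let's look at what dyld does.

The steps below follow page 59 of the WWDC 2016 Session 406 PDF.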

load dylibs

Map all dependent dylibs, recurse: dyld recursively finds and maps every dylib (dynamic library) the app depends on.

rebase

Rebase all images: dyld adjusts every pointer that points inside an image by adding an offset called the slide. WWDC defines the slide as:

Slide = actual_address - preferred_address

ASLR (Address Space Layout Randomization) was introduced for security reasons:

ASLR
  • Address Space Layout Randomization
  • Images load at random address

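In short, when a Mach-O file is mmap'ed into virtual memory, its start address is shifted by a random offset, the slide. dyld has to correct for this by adding the slide to every internal pointer.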
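
The fix-up can be pictured with a small sketch. This is not dyld's code: compute_slide and rebase_image are hypothetical names, the image's data pages are modelled as a plain array of pointer-sized slots, and the rebase info is modelled as a list of slot indexes that hold internal pointers.

```c
#include <stddef.h>
#include <stdint.h>

/* Slide = actual_address - preferred_address */
static intptr_t compute_slide(uintptr_t actual_base, uintptr_t preferred_base)
{
    return (intptr_t)(actual_base - preferred_base);
}

/* Add the slide to every slot that the (simplified) rebase info lists.
 * `slots` stands in for the image's data pages; `offsets` holds the
 * indexes of the slots that contain internal pointers. */
static void rebase_image(uintptr_t *slots, const size_t *offsets,
                         size_t count, intptr_t slide)
{
    for (size_t i = 0; i < count; i++) {
        slots[offsets[i]] += (uintptr_t)slide;
    }
}
```

Slots that are not listed, such as plain integers, are left untouched, which is why the loader needs the rebase info at all.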

Use the following command to view a Mach-O file's rebase info, or open the binary in MachOView and look at Dynamic Loader Info > Rebase Info.

xcrun dyldinfo -rebase demo.app/demo 

bind

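Bind all images: dyld looks up the symbol table and sets the pointers that point outside the image. Dynamic libraries are not compiled into the final binary; the addresses of the functions being called are looked up at run time, and this process of resolving external symbols is called binding. For example, if the project uses UIView, which belongs to the UIKit framework, that symbol has to be bound.
Compared with rebasing, binding is computationally more expensive but causes fewer page faults, since rebasing has already paged in most of the data it touches.

Use the following command to view a Mach-O file's bind info, or open the binary in MachOView and look at Dynamic Loader Info > Bind Info.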

xcrun dyldinfo -bind demo.app/demo 

objc setup

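objc setup is mainly the initialization of the Objective-C runtime. In WWDC's words:

  • Most ObjC set up done via rebasing and binding
  • All ObjC class definitions are registered
  • Non-fragile ivars offsets updated
  • Categories are inserted into method lists
  • Selectors are uniqued

Most of the ObjC setup has already been done by rebasing and binding. What remains is to register all ObjC classes, update ivar offsets (non-fragile ivars, a feature of the Objective-C 2.0 runtime that preserves binary compatibility), insert category methods into the method lists, and make selectors unique. See map_images for the implementation.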
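
"Selectors are uniqued" means that every occurrence of the same selector name ends up as the same pointer, so selectors can be compared by address. The sketch below shows the idea with a tiny fixed-size table and a linear search. It is only an illustration: sel_unique, unique_names and MAX_SELECTORS are invented for this example, and the real runtime uses its own selector table rather than this structure.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SELECTORS 64

/* Canonical name pointers registered so far. */
static const char *unique_names[MAX_SELECTORS];
static size_t unique_count = 0;

/* Return the canonical pointer for `name`, registering it on first use.
 * Returns NULL if the fixed-size table is full. */
static const char *sel_unique(const char *name)
{
    for (size_t i = 0; i < unique_count; i++) {
        if (strcmp(unique_names[i], name) == 0) {
            return unique_names[i];   /* already known: reuse the pointer */
        }
    }
    if (unique_count == MAX_SELECTORS) return NULL;
    unique_names[unique_count] = name;
    return unique_names[unique_count++];
}
```

Two different buffers holding the same name come back as one pointer, which is what the selector fix-up loop in _read_images achieves for every selector reference in an image.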

initializers

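At this point dyld starts running initializers. For each image, the class +load methods run first (superclasses first), followed by the category +load methods, and then the C++ initializers for static objects and the functions marked __attribute__((constructor)). Finally, dyld calls main.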

  • C++ generates initializer for statically allocated objects
  • ObjC +load methods
  • Run "bottom up" so each initializer can call dylibs below it
  • Lastly, Dyld calls main() in executable
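
A small sketch of the constructor part of this step, assuming a GCC or Clang toolchain (the constructor attribute is a compiler extension). The names init_step, constructor_ran_at and before_main are made up for the example. +load needs the Objective-C runtime and is not shown.

```c
/* Counts initialization steps in the order they happen. */
static int init_step = 0;

/* Records at which step the constructor ran (0 means it never ran). */
static int constructor_ran_at = 0;

/* GCC/Clang extension: the loader runs this function before main(). */
__attribute__((constructor))
static void before_main(void)
{
    constructor_ran_at = ++init_step;
}
```

By the time main starts, constructor_ran_at is already set, without main ever calling before_main.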

At this point the launch sequence is essentially complete.

main

Once dyld setup is done, main() is called, which in turn sets up UIApplication.

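More on _objc_init

During launch initialization, _objc_init is called. It does preparatory work such as reading environment variables, running static constructors, and registering the callbacks for image mapping, image loading, and image unloading. Open objc-os.mm and find the following code: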

/***********************************************************************
* _objc_init
* Bootstrap initialization. Registers our image notifier with dyld.
* Called by libSystem BEFORE library initialization time
**********************************************************************/
void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;

    // fixme defer initialization until an objc-using image is found?
    environ_init();   // read environment variables
    tls_init();
    static_init();    // run C++ static constructors
    lock_init();
    exception_init();
    // register the image callbacks with dyld
    _dyld_objc_notify_register(&map_images, load_images, unmap_image);
}

environ_init

Reads the runtime's environment settings, i.e. the environment variables configured in Xcode. See objc-env.h for the full list.

OPTION( PrintImages,              OBJC_PRINT_IMAGES,               "log image and library names as they are loaded")
OPTION( PrintImageTimes,          OBJC_PRINT_IMAGE_TIMES,          "measure duration of image loading steps")
OPTION( PrintLoading,             OBJC_PRINT_LOAD_METHODS,         "log calls to class and category +load methods")
OPTION( PrintInitializing,        OBJC_PRINT_INITIALIZE_METHODS,   "log calls to class +initialize methods")
// ... omitted

tls_init

Initializes thread-local storage and registers the per-thread destructor. See objc-runtime.mm.

void tls_init(void)
{
#if SUPPORT_DIRECT_THREAD_KEYS
    _objc_pthread_key = TLS_DIRECT_KEY;
    pthread_key_init_np(TLS_DIRECT_KEY, &_objc_pthread_destroyspecific);
#else
    _objc_pthread_key = tls_create(&_objc_pthread_destroyspecific);
#endif
}

static_init

Runs the C++ static constructors. See objc-os.mm.

/***********************************************************************
* static_init
* Run C++ static constructor functions.
* libc calls _objc_init() before dyld would call our static constructors, 
* so we have to do it ourselves.
**********************************************************************/
static void static_init()
{
    size_t count;
    Initializer *inits = getLibobjcInitializers(&_mh_dylib_header, &count);
    for (size_t i = 0; i < count; i++) {
        inits[i]();
    }
}

lock_init

Initializes locks. See objc-runtime-new.mm.

void lock_init(void)
{
#if SUPPORT_QOS_HACK
    BackgroundPriority = _pthread_qos_class_encode(QOS_CLASS_BACKGROUND, 0, 0);
    MainPriority = _pthread_qos_class_encode(qos_class_main(), 0, 0);
# if DEBUG
    pthread_key_init_np(QOS_KEY, &destroyQOSKey);
# endif
#endif
}

exception_init

Initializes the exception handler. See objc-exception.mm.

/***********************************************************************
* exception_init
* Initialize libobjc's exception handling system.
* Called by map_images().
**********************************************************************/
void exception_init(void)
{
    old_terminate = std::set_terminate(&_objc_terminate);
}

_dyld_objc_notify_register

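Registers the callbacks for mapping, loading, and unmapping images. dyld invokes the mapping and loading callbacks whenever new images are loaded, and the unmapping callback when an image is unloaded.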
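
The registration pattern can be sketched in plain C. Everything here is hypothetical and much simpler than the real interface: image_notifier, notify_register, loader_add_image and loader_remove_image stand in for the loader side, the callbacks take only a path, and the on_* functions just record that they were called.

```c
#include <stddef.h>

typedef void (*image_callback)(const char *path);

/* Hypothetical stand-in for the loader's callback registry. */
struct image_notifier {
    image_callback mapped;
    image_callback loaded;
    image_callback unmapped;
};

static struct image_notifier notifier;

/* Runtime side: hand the three callbacks to the loader once, at start-up. */
static void notify_register(image_callback mapped, image_callback loaded,
                            image_callback unmapped)
{
    notifier.mapped = mapped;
    notifier.loaded = loaded;
    notifier.unmapped = unmapped;
}

/* Loader side: a new image is mapped first, then loaded. */
static void loader_add_image(const char *path)
{
    if (notifier.mapped) notifier.mapped(path);
    if (notifier.loaded) notifier.loaded(path);
}

/* Loader side: an image is going away. */
static void loader_remove_image(const char *path)
{
    if (notifier.unmapped) notifier.unmapped(path);
}

/* Sample callbacks that record the order in which they were invoked. */
static char event_log[16];
static size_t event_count = 0;

static void record(char tag)
{
    if (event_count < sizeof event_log) event_log[event_count++] = tag;
}

static void on_mapped(const char *path)   { (void)path; record('M'); }
static void on_loaded(const char *path)   { (void)path; record('L'); }
static void on_unmapped(const char *path) { (void)path; record('U'); }
```

After notify_register(on_mapped, on_loaded, on_unmapped), adding and then removing an image records the events in the order mapped, loaded, unmapped.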

map_images

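When the runtime receives dyld's image-mapped notification, it starts initializing itself: registering ObjC classes, updating ivar offsets, merging category methods into their classes, and so on. Open objc-runtime-new.mm and find the following code: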

/***********************************************************************
* map_images
* Process the given images which are being mapped in by dyld.
* Calls ABI-agnostic code after taking ABI-specific locks.
*
* Locking: write-locks runtimeLock
**********************************************************************/
void
map_images(unsigned count, const char * const paths[],
           const struct mach_header * const mhdrs[])
{
    rwlock_writer_t lock(runtimeLock);
    return map_images_nolock(count, paths, mhdrs);
}

On entry the function takes the lock (a read-write lock, acquired for writing), then hands off to map_images_nolock.

map_images_nolock

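The actual implementation of image mapping. It sets up the shared cache optimizations, registers the default selectors, initializes the autorelease pool and the side tables, loads classes, and so on. The function is long, so only part of it is shown: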

void 
map_images_nolock(unsigned mhCount, const char * const mhPaths[],
                  const struct mach_header * const mhdrs[])
{
    static bool firstTime = YES;
    header_info *hList[mhCount];
    uint32_t hCount;
    size_t selrefCount = 0;
    // Perform first-time initialization if necessary.
    // This function is called before ordinary library initializers. 
    // fixme defer initialization until an objc-using image is found?
    if (firstTime) {
        // set up pre-optimization support
        preopt_init();
    }

    // count classes
    // Find all images with Objective-C metadata.
    hCount = 0;
    // Count classes. Size various table based on the total.
    int totalClasses = 0;

    // Perform one-time runtime initialization that must be deferred until 
    // the executable itself is found. This needs to be done before 
    // further initialization.
    // (The executable may not be present in this infoList if the 
    // executable does not contain Objective-C code but Objective-C 
    // is dynamically loaded later.
    if (firstTime) {
        sel_init(selrefCount);
        // initialize the autorelease pool and the side tables
        arr_init();
    }

    // ... (code omitted from this excerpt)

    // read the images
    if (hCount > 0) {
        _read_images(hList, hCount, totalClasses, unoptimizedTotalClasses);
    }
    firstTime = NO;
}

// Initializes the autorelease pool and the side tables
void arr_init(void) 
{
    AutoreleasePoolPage::init();
    SideTableInit();
}

Finally it reads the images by calling _read_images.

_read_images

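Reads the images. This function does a lot: it loads classes, registers selectors, repairs vtable dispatch call sites, loads protocols, realizes non-lazy classes (those with +load methods or static instances), and loads categories. There is too much code to show in full, so only part is excerpted; see objc-runtime-new.mm for the rest.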

/***********************************************************************
* _read_images
* Perform initial processing of the headers in the linked 
* list beginning with headerList. 
*
* Called by: map_images_nolock
*
* Locking: runtimeLock acquired by map_images
**********************************************************************/
void _read_images(header_info **hList, uint32_t hCount, int totalClasses, int unoptimizedTotalClasses)
{
    header_info *hi;
    uint32_t hIndex;
    size_t count;
    size_t i;
    Class *resolvedFutureClasses = nil;
    size_t resolvedFutureClassCount = 0;
    // load classes
    for (EACH_HEADER) {
        if (! mustReadClasses(hi)) {
            // Image is sufficiently optimized that we need not call readClass()
            continue;
        }

        bool headerIsBundle = hi->isBundle();
        bool headerIsPreoptimized = hi->isPreoptimized();

        classref_t *classlist = _getObjc2ClassList(hi, &count);
        for (i = 0; i < count; i++) {
            Class cls = (Class)classlist[i];
            Class newCls = readClass(cls, headerIsBundle, headerIsPreoptimized);

            if (newCls != cls  &&  newCls) {
                // Class was moved but not deleted. Currently this occurs 
                // only when the new class resolved a future class.
                // Non-lazily realize the class below.
                resolvedFutureClasses = (Class *)
                    realloc(resolvedFutureClasses, 
                            (resolvedFutureClassCount+1) * sizeof(Class));
                resolvedFutureClasses[resolvedFutureClassCount++] = newCls;
            }
        }
    }

    // register selectors
    // Fix up @selector references
    static size_t UnfixedSelectors;
    sel_lock();
    for (EACH_HEADER) {
        if (hi->isPreoptimized()) continue;

        bool isBundle = hi->isBundle();
        SEL *sels = _getObjc2SelectorRefs(hi, &count);
        UnfixedSelectors += count;
        for (i = 0; i < count; i++) {
            const char *name = sel_cname(sels[i]);
            sels[i] = sel_registerNameNoLock(name, isBundle);
        }
    }
    sel_unlock();

    // repair vtable dispatch call sites
#if SUPPORT_FIXUP
    // Fix up old objc_msgSend_fixup call sites
    for (EACH_HEADER) {
        message_ref_t *refs = _getObjc2MessageRefs(hi, &count);
        if (count == 0) continue;

        if (PrintVtables) {
            _objc_inform("VTABLES: repairing %zu unsupported vtable dispatch "
                         "call sites in %s", count, hi->fname());
        }
        for (i = 0; i < count; i++) {
            fixupMessageRef(refs+i);
        }
    }
#endif

    // load protocols
    // Discover protocols. Fix up protocol refs.
    for (EACH_HEADER) {
        extern objc_class OBJC_CLASS_$_Protocol;
        Class cls = (Class)&OBJC_CLASS_$_Protocol;
        assert(cls);
        NXMapTable *protocol_map = protocols();
        bool isPreoptimized = hi->isPreoptimized();
        bool isBundle = hi->isBundle();

        protocol_t **protolist = _getObjc2ProtocolList(hi, &count);
        for (i = 0; i < count; i++) {
            readProtocol(protolist[i], cls, protocol_map, 
                         isPreoptimized, isBundle);
        }
    }
    // Fix up @protocol references
    // Preoptimized images may have the right 
    // answer already but we don't know for sure.
    for (EACH_HEADER) {
        protocol_t **protolist = _getObjc2ProtocolRefs(hi, &count);
        for (i = 0; i < count; i++) {
            remapProtocolRef(&protolist[i]);
        }
    }

    // realize non-lazy classes
    // Realize non-lazy classes (for +load methods and static instances)
    for (EACH_HEADER) {
        classref_t *classlist = 
            _getObjc2NonlazyClassList(hi, &count);
        for (i = 0; i < count; i++) {
            Class cls = remapClass(classlist[i]);
            if (!cls) continue;

            // hack for class __ARCLite__, which didn't get this above
#if TARGET_OS_SIMULATOR
            if (cls->cache._buckets == (void*)&_objc_empty_cache  &&  
                (cls->cache._mask  ||  cls->cache._occupied)) 
            {
                cls->cache._mask = 0;
                cls->cache._occupied = 0;
            }
            if (cls->ISA()->cache._buckets == (void*)&_objc_empty_cache  &&  
                (cls->ISA()->cache._mask  ||  cls->ISA()->cache._occupied)) 
            {
                cls->ISA()->cache._mask = 0;
                cls->ISA()->cache._occupied = 0;
            }
#endif
            realizeClass(cls);
        }
    }

    // load categories
    // Discover categories. 
    for (EACH_HEADER) {
        category_t **catlist = 
            _getObjc2CategoryList(hi, &count);
        bool hasClassProperties = hi->info()->hasCategoryClassProperties();

        for (i = 0; i < count; i++) {
            category_t *cat = catlist[i];
            Class cls = remapClass(cat->cls);

            if (!cls) {
                // Category's target class is missing (probably weak-linked).
                // Disavow any knowledge of this category.
                catlist[i] = nil;
                if (PrintConnecting) {
                    _objc_inform("CLASS: IGNORING category \?\?\?(%s) %p with "
                                 "missing weak-linked target class", 
                                 cat->name, cat);
                }
                continue;
            }

            // Process this category. 
            // First, register the category with its target class. 
            // Then, rebuild the class's method lists (etc) if 
            // the class is realized. 
            bool classExists = NO;
            if (cat->instanceMethods ||  cat->protocols  
                ||  cat->instanceProperties) 
            {
                addUnattachedCategoryForClass(cat, cls, hi);
                if (cls->isRealized()) {
                    remethodizeClass(cls);
                    classExists = YES;
                }
                if (PrintConnecting) {
                    _objc_inform("CLASS: found category -%s(%s) %s", 
                                 cls->nameForLogging(), cat->name, 
                                 classExists ? "on existing class" : "");
                }
            }

            if (cat->classMethods  ||  cat->protocols  
                ||  (hasClassProperties && cat->_classProperties)) 
            {
                addUnattachedCategoryForClass(cat, cls->ISA(), hi);
                if (cls->ISA()->isRealized()) {
                    remethodizeClass(cls->ISA());
                }
                if (PrintConnecting) {
                    _objc_inform("CLASS: found category +%s(%s)", 
                                 cls->nameForLogging(), cat->name);
                }
            }
        }
    }
}

Image mapping is now complete.

load_images

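When the runtime receives dyld's image-loaded notification, it calls this function, whose job is to finish loading the image by running its +load methods. Open objc-runtime-new.mm and find the following code: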

void
load_images(const char *path __unused, const struct mach_header *mh)
{
    // Return without taking locks if there are no +load methods here.
    if (!hasLoadMethods((const headerType *)mh)) return;

    recursive_mutex_locker_t lock(loadMethodLock);

    // Discover load methods
    {
        rwlock_writer_t lock2(runtimeLock);
        prepare_load_methods((const headerType *)mh);
    }

    // Call +load methods (without runtimeLock - re-entrant)
    call_load_methods();
}

load_images does two main things: it first calls prepare_load_methods to prepare the +load methods, then calls call_load_methods to run them all.

prepare_load_methods

// Prepare +load methods
void prepare_load_methods(const headerType *mhdr)
{
    size_t count, i;
    runtimeLock.assertWriting();
    classref_t *classlist = 
        _getObjc2NonlazyClassList(mhdr, &count);
    // schedule all class +load methods first
    for (i = 0; i < count; i++) {
        schedule_class_load(remapClass(classlist[i]));
    }

    // then handle category +load methods
    category_t **categorylist = _getObjc2NonlazyCategoryList(mhdr, &count);
    for (i = 0; i < count; i++) {
        category_t *cat = categorylist[i];
        Class cls = remapClass(cat->cls);
        if (!cls) continue;  // category for ignored weak-linked class
        realizeClass(cls);
        assert(cls->ISA()->isRealized());
        // append to the global loadable_categories array
        add_category_to_loadable_list(cat);
    }
}

prepare_load_methods first calls schedule_class_load to handle class +load methods, then handles category +load methods. In what order are a superclass and its subclass handled? Let's keep reading:

// Schedule class +load, superclass first
static void schedule_class_load(Class cls)
{
    if (!cls) return;
    assert(cls->isRealized());  // _read_images should realize

    if (cls->data()->flags & RW_LOADED) return;

    // add the superclass's +load first
    // Ensure superclass-first ordering
    schedule_class_load(cls->superclass);

    // then add this class's +load
    add_class_to_loadable_list(cls);
    cls->setInfo(RW_LOADED); 
}
// Schedule category +load
void add_category_to_loadable_list(Category cat)
{
    IMP method;
    loadMethodLock.assertLocked();
    method = _category_getLoadMethod(cat);
    if (loadable_categories_used == loadable_categories_allocated) {
        loadable_categories_allocated = loadable_categories_allocated*2 + 16;
        loadable_categories = (struct loadable_category *)
            realloc(loadable_categories,
                              loadable_categories_allocated *
                              sizeof(struct loadable_category));
    }
    loadable_categories[loadable_categories_used].cat = cat;
    loadable_categories[loadable_categories_used].method = method;
    loadable_categories_used++;
}

schedule_class_load makes this clear: it first stores the superclass's +load, then calls add_class_to_loadable_list to store the class's own +load. That completes the +load preparation.
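
The superclass-first ordering is easy to reproduce in isolation. The sketch below is a simplified model, not the runtime's code: sketch_class, loadable, MAX_LOADABLE and schedule_load are invented names, and the loaded field plays the role of the RW_LOADED flag.

```c
#include <stddef.h>

#define MAX_LOADABLE 32

/* Minimal model of a class: a name, a superclass, and a "scheduled" flag. */
struct sketch_class {
    const char *name;
    struct sketch_class *superclass;
    int loaded;   /* plays the role of the RW_LOADED flag */
};

/* Classes whose +load is waiting to be called, in call order. */
static struct sketch_class *loadable[MAX_LOADABLE];
static size_t loadable_used = 0;

static void schedule_load(struct sketch_class *cls)
{
    if (cls == NULL || cls->loaded) return;   /* end of chain, or already queued */

    schedule_load(cls->superclass);           /* superclass goes in first */

    if (loadable_used < MAX_LOADABLE) {
        loadable[loadable_used++] = cls;      /* then the class itself */
    }
    cls->loaded = 1;
}
```

Scheduling a leaf class queues its whole superclass chain from the root down, and scheduling any of those classes again is a no-op because of the flag.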

call_load_methods

Next, all the +load methods are called. First the source:

// Call all pending +load methods
void call_load_methods(void)
{
    static bool loading = NO;
    bool more_categories;
    loadMethodLock.assertLocked();
    // Re-entrant calls do nothing; the outermost call will finish the job.
    if (loading) return;
    loading = YES;
    void *pool = objc_autoreleasePoolPush();
    do {
        // 1. Repeatedly call class +loads until there aren't any more
        while (loadable_classes_used > 0) {
            // keep calling class +load methods until none are left
            call_class_loads();
        }
        // call category +load methods
        // 2. Call category +loads ONCE
        more_categories = call_category_loads();

        // 3. Run more +loads if there are classes OR more untried categories
    } while (loadable_classes_used > 0  ||  more_categories);

    objc_autoreleasePoolPop(pool);
    loading = NO;
}

Clearly, a while loop first runs the class +load methods, and only then are the category +load methods called.
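
The control flow can be modelled with two queues of function pointers. This is a simplified sketch under stated assumptions: all names (pending_class_loads, run_and_clear, call_all_loads and the sample load functions) are hypothetical, the queues are fixed-size arrays, and the loop condition checks the category queue directly instead of using a return value as call_category_loads does.

```c
#include <stddef.h>

#define MAX_PENDING 16

typedef void (*load_fn)(void);

static load_fn pending_class_loads[MAX_PENDING];
static size_t pending_class_count = 0;
static load_fn pending_category_loads[MAX_PENDING];
static size_t pending_category_count = 0;

/* Detach the current batch before running it, so that a load function
 * which queues more work does not disturb the iteration. */
static void run_and_clear(load_fn *list, size_t *count)
{
    load_fn batch[MAX_PENDING];
    size_t n = *count;
    for (size_t i = 0; i < n; i++) batch[i] = list[i];
    *count = 0;
    for (size_t i = 0; i < n; i++) batch[i]();
}

static void call_all_loads(void)
{
    static int loading = 0;
    if (loading) return;              /* re-entrant calls do nothing */
    loading = 1;
    do {
        /* 1. drain class loads until none are left */
        while (pending_class_count > 0) {
            run_and_clear(pending_class_loads, &pending_class_count);
        }
        /* 2. run the category loads once */
        run_and_clear(pending_category_loads, &pending_category_count);
        /* 3. repeat if either step queued more work */
    } while (pending_class_count > 0 || pending_category_count > 0);
    loading = 0;
}

/* Sample load functions that record the order in which they ran. */
static char order[8];
static size_t order_len = 0;

static void note(char tag)
{
    if (order_len < sizeof order) order[order_len++] = tag;
}

static void class_a_load(void)  { note('A'); }
static void class_b_load(void)  { note('B'); }
static void category_load(void) { note('c'); }
```

Even if the category load is queued before the two class loads, call_all_loads runs both class loads first and the category load last.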

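Summary

To sum up:

  • When preparing +load methods, the superclass's +load is handled before the subclass's; category +load methods are prepared last and stored separately.
  • When calling +load methods, class +load methods run first, with the superclass's before the subclass's; category +load methods run last.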

References

Optimizing App Startup Time (WWDC 2016, Session 406)
dyld source code

Tags: objc