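FAQ Index Build Architecture Design

This post summarizes the index build architecture I worked with at pa.

FAQ index build pipeline

Intelligent Q&A (FAQ) is a dialogue system built on an AI technology stack, and it is the core supporting module of intelligent customer service (the others include task-oriented bots, search, recommendation, and input suggestion). Its core flow resembles most retrieval services: recall followed by ranking. In the recall stage, FAQ depends on several kinds of indexes, such as ES indexes, semantic indexes (SM_NSG, SM_ANNOY), and preprocessing indexes.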

  • Index build and serving modes

[Figure: index build and serving modes (image 1745408920271)]

  • Index build steps

[Figure: index build steps (image 1745408927937)]

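  • Online vs. offline
    • ES indexes support online incremental updates.
    • SM indexes can only be rebuilt in full, and the full data set has to be loaded into memory, so they are rebuilt offline (usually on a late-night schedule) to avoid affecting online traffic. Improving the index switch mechanism alone would not make online builds viable: the build itself (vectorization, for example) puts heavy load on upstream and downstream services, so the entire pipeline would need traffic isolation for index builds.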

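In summary, the FAQ index build pipeline has the following characteristics:

  1. It touches many services and middleware components, so builds fail easily.
  2. It has to build several kinds of indexes (ES, SM), and can only do so offline.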

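The architecture therefore has to meet these requirements:

  1. Build failures and anomalies need thorough monitoring and handling, with support for flexible build strategies.
  2. Data consistency must be guaranteed. A case such as ES building successfully while SM fails, leading to abnormal recall, must not occur.
  3. Build time should be reduced. Because SM can only be rebuilt in full, builds take too long when the corpus is large (more than 1 million entries) and put heavy concurrent load on upstream and downstream services.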

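Product-routed SM index service

As noted above, an SM index must be loaded into memory before it can serve queries (P.S. a vector database is a better alternative). As a supporting module of the dialogue system, FAQ usually serves customer-service bot products across multiple scenarios. Under a traditional SaaS design, every instance would have to load the SM indexes of all products, which consumes a great deal of memory and cannot scale or be maintained. Hence the product-routed SM index service shown in the figure below.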

[Figure: product-routed SM index service (image 1745408936863)]

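  1. The scheduling center maintains a mapping between products and machine instances, so each index service instance only loads the in-memory indexes of some products, which reduces memory consumption. In the figure above, index-service has 3 instances, and the first one only loads products 1, 2, 3, and 4.
  2. During an index build, the scheduling center assigns each in-memory index to a specific instance based on the memory status of the index service instances. (After a restart, a service reloads its indexes when notified by the scheduling center.) When loading completes, the instance reports back to the scheduling center, where the loading status of every product can be monitored easily. A scale-out/scale-in mechanism is also provided to improve availability.
  3. The gateway periodically syncs the product-to-instance mapping table from the scheduling center. When a request carrying an appId passes through the gateway, the gateway looks up the matching instances from the subscription mapping and forwards the request according to the load-balancing policy.
  4. P.S. Building on this architecture, more service modules will later be defined as resources, allowing custom product routing and flexible control over instance resources, for example giving more instances to higher-traffic products under the SaaS architecture.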
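
The memory-based placement in step 2 might look roughly like the sketch below. It is only an illustration under assumptions: the class `IndexAllocator`, its methods, and the "most free memory wins" rule are hypothetical, since the post does not describe the scheduling center's actual allocation policy.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical allocator: places each product index on the instance with the most free memory. */
final class IndexAllocator {
    private final Map<String, Long> freeMemoryMb = new LinkedHashMap<>();        // instance -> free memory (MB)
    private final Map<String, List<String>> assignments = new LinkedHashMap<>(); // instance -> product ids

    void registerInstance(String instance, long freeMb) {
        freeMemoryMb.put(instance, freeMb);
        assignments.put(instance, new ArrayList<>());
    }

    /** Assigns the product to the instance with the most free memory and returns that instance. */
    String assign(String productId, long indexSizeMb) {
        String best = null;
        for (Map.Entry<String, Long> e : freeMemoryMb.entrySet()) {
            boolean fits = e.getValue() >= indexSizeMb;
            if (fits && (best == null || e.getValue() > freeMemoryMb.get(best))) {
                best = e.getKey();
            }
        }
        if (best == null) {
            // In the described system this is where scaling out would be triggered (assumption).
            throw new IllegalStateException("no instance has " + indexSizeMb + " MB free");
        }
        freeMemoryMb.put(best, freeMemoryMb.get(best) - indexSizeMb);
        assignments.get(best).add(productId);
        return best;
    }

    /** Products currently assigned to the given instance. */
    List<String> productsOn(String instance) {
        return assignments.get(instance);
    }
}
```

A real scheduling center would also have to handle instance restarts and load reports; those parts are left out here.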

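In summary, the product-routed SM index service exists essentially to solve the high memory footprint of SM indexes, and the resource scheduling, monitoring, and scaling system grew out of that. Its drawbacks are also fairly obvious: it depends on fixed IPs, is tightly coupled, is hard to migrate, and is hard to maintain.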
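
As a rough illustration of the gateway-side lookup in this design, the sketch below keeps an appId-to-instances table and picks an instance round-robin. The class name `ProductRouter`, its methods, and the use of round-robin are hypothetical; the post does not show the actual sync protocol or load-balancing policy.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical gateway-side router: maps an appId to the instances that loaded its SM index. */
final class ProductRouter {
    // appId -> instances serving that product (assumed to be synced from the scheduling center)
    private volatile Map<String, List<String>> routes = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger();

    /** Replaces the whole table, as a periodic sync from the scheduling center might do. */
    void sync(Map<String, List<String>> latest) {
        this.routes = new ConcurrentHashMap<>(latest);
    }

    /** Picks an instance for the appId with simple round-robin; throws if the product is unrouted. */
    String route(String appId) {
        List<String> instances = routes.get(appId);
        if (instances == null || instances.isEmpty()) {
            throw new IllegalStateException("no index-service instance loaded for appId " + appId);
        }
        int idx = Math.floorMod(counter.getAndIncrement(), instances.size());
        return instances.get(idx);
    }
}
```

Swapping the whole table on each sync keeps lookups lock-free, at the cost of routing against a slightly stale table between syncs.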

FAQ index build architecture design

End-to-end index build flow

[Figure: end-to-end index build flowchart (image 1745408944774)]

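As shown in the flowchart above:

  1. After the knowledge base publishes corpus data (the knowledge base and FAQ are two separate systems, so corpus data is synced through an API), faq-builder first buffers the notification. Only when the scheduled job triggers an index build does it merge the notifications and sync the corpus. Once the sync finishes, it generates the build tasks for the corresponding indexes, persists them to the database, and starts task dispatch.
  2. When the task dispatcher starts, it queries the database for all tasks in the created or executing state and distributes them to idle faq-builder instances, which execute them asynchronously and in parallel. The execution details are in the flowchart on the right above.
  3. When all tasks have finished, the engine is notified to switch indexes.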
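
The buffering and merging in step 1 can be pictured with a minimal sketch like the one below. The class `PublishNotificationBuffer`, the string task format, and the "one ES task plus one SM task per product" rule are assumptions made for illustration; the real faq-builder persists tasks to a database rather than returning strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical buffer: publish notifications are parked until the timer fires, then merged into tasks. */
final class PublishNotificationBuffer {
    private final Set<String> pendingProducts = new LinkedHashSet<>();

    /** Called when the knowledge base publishes corpus for a product; duplicates collapse into one entry. */
    synchronized void onPublish(String productId) {
        pendingProducts.add(productId);
    }

    /** Called by the scheduled trigger: drains the buffer and emits one ES and one SM task per product. */
    synchronized List<String> drainToTasks(String taskId) {
        List<String> tasks = new ArrayList<>();
        for (String productId : pendingProducts) {
            // Corpus sync for productId would happen here, before the tasks are persisted.
            tasks.add(taskId + ":" + productId + ":ES");
            tasks.add(taskId + ":" + productId + ":SM");
        }
        pendingProducts.clear();
        return tasks;
    }
}
```

Merging on the timer means several publishes for the same product within one window cost a single build.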

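The benefits of this design are:

  1. Build tasks are persisted, so every build and every build path can be queried and monitored.
  2. Build concurrency is controlled, which keeps a large batch of corpus data from overwhelming upstream and downstream services when processed at once. Asynchronous parallel execution across instances also improves build performance.
  3. It adds expiry handling for build tasks, automatic retry on timeout, and scheduled cleanup of resources left by failed tasks (not shown in the figure).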
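
The timeout-driven automatic retry in point 3 could be sketched as follows. Everything here is an assumption for illustration: the class `BuildTaskRecord`, its fields, the retry limit, and the choice to re-queue by setting the status back to CREATED are not taken from the original system.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical build-task record with a timeout check; field names and the retry limit are illustrative. */
final class BuildTaskRecord {
    enum Status { CREATED, EXECUTING, FAILED }

    final String taskId;
    Status status = Status.CREATED;
    Instant startedAt;
    int retryCount;

    BuildTaskRecord(String taskId) {
        this.taskId = taskId;
    }

    void markExecuting(Instant now) {
        status = Status.EXECUTING;
        startedAt = now;
    }

    /**
     * If the task has been executing longer than the timeout, re-queue it (back to CREATED)
     * until maxRetries is reached, then mark it FAILED. Returns true if the task timed out.
     */
    boolean checkTimeout(Instant now, Duration timeout, int maxRetries) {
        if (status != Status.EXECUTING || Duration.between(startedAt, now).compareTo(timeout) <= 0) {
            return false;
        }
        if (retryCount < maxRetries) {
            retryCount++;
            status = Status.CREATED; // picked up again on the next dispatch round
        } else {
            status = Status.FAILED;
        }
        return true;
    }
}
```

A periodic job would call `checkTimeout` for every executing task and persist the result.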

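Index build task state transitions

As shown in the figure below, a build task ends in one of three final states: discarded, succeeded, or failed. Manual retry is allowed from every task state.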

[Figure: index build task state transitions (image 1745408953510)]
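
Since the state diagram itself is not reproduced here, the following is only a guess at how such a state machine might be encoded. The three terminal states and the "manual retry from any state" rule come from the text; the enum name `BuildTaskStatus` and the automatic transition table are assumptions.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical status enum; terminal states follow the text, the transition table is an assumption. */
enum BuildTaskStatus {
    CREATED, EXECUTING, SUCCEEDED, FAILED, DISCARDED;

    /** Terminal states named in the text: discarded, succeeded, failed. */
    boolean isTerminal() {
        return this == SUCCEEDED || this == FAILED || this == DISCARDED;
    }

    /** Automatic transitions assumed for this sketch. */
    Set<BuildTaskStatus> nextStates() {
        switch (this) {
            case CREATED:
                return EnumSet.of(EXECUTING, DISCARDED);
            case EXECUTING:
                // CREATED here stands for a timeout-triggered automatic retry
                return EnumSet.of(SUCCEEDED, FAILED, DISCARDED, CREATED);
            default:
                return EnumSet.noneOf(BuildTaskStatus.class);
        }
    }

    boolean canMoveTo(BuildTaskStatus target) {
        return nextStates().contains(target);
    }

    /** Manual retry is allowed from every state and always re-queues the task. */
    BuildTaskStatus manualRetry() {
        return CREATED;
    }
}
```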

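The benefits of this design are:

  1. Builds can be aborted and partially retried by hand, are retried automatically on timeout by default, and resume being dispatched after a builder restart (not shown in the figure), which greatly improves the flexibility and stability of index builds.
  2. Index versions are identified by taskId (similar to a releaseId) and switched by it, which guarantees data consistency.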
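
One way to picture the taskId-based switch is the sketch below: the active version flips only after every required index type has reported success for the new taskId. The class `IndexVersionSwitch` and its methods are hypothetical, and the real engine's switching protocol is not shown in this post.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical engine-side switch: the active version changes only after all index types are built. */
final class IndexVersionSwitch {
    private final Set<String> requiredTypes;
    private final Set<String> finishedTypes = new HashSet<>();
    private final AtomicReference<String> active = new AtomicReference<>();
    private final String buildingTaskId;

    IndexVersionSwitch(String currentTaskId, String buildingTaskId, Set<String> requiredTypes) {
        this.active.set(currentTaskId);
        this.buildingTaskId = buildingTaskId;
        this.requiredTypes = new HashSet<>(requiredTypes);
    }

    /** Records one finished index type; switches the active version once all required types are done. */
    synchronized boolean onIndexBuilt(String indexType) {
        finishedTypes.add(indexType);
        if (finishedTypes.containsAll(requiredTypes)) {
            active.set(buildingTaskId); // all index types flip to the same version together
            return true;
        }
        return false;
    }

    /** The version that queries should read from. */
    String currentVersion() {
        return active.get();
    }
}
```

If the SM build fails, `onIndexBuilt("SM")` is never called and queries keep reading the previous version of both indexes.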

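Interruptible index build thread

Sample code:

  • A running flag marks that the program is executing. When a thread executes the run method, it first registers itself in the monitor map under its taskId.
  • The top-level abstract class AbstractStoppableRunnable exposes a stop method, which lets callers flip the running flag to try to stop the program.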
java
import java.util.Map;

public abstract class AbstractStoppableRunnable implements Runnable {
    // volatile so that a stop() from another thread is visible to the worker thread
    private volatile boolean running;
    private final Map<String, AbstractStoppableRunnable> monitorMap;
    private final String taskId;

    public AbstractStoppableRunnable(String taskId, Map<String, AbstractStoppableRunnable> monitorMap) {
        this.taskId = taskId;
        this.monitorMap = monitorMap;
    }

    @Override
    public void run() {
        try {
            // Note: a stop() issued before run() begins is overwritten here
            this.running = true;
            // Register under taskId so callers can look the task up and stop it
            monitorMap.put(taskId, this);
            execute();
            if (isRunning()) {
                System.out.println("callback success");
            } else {
                System.out.println("manual stop, callback failure");
            }
        } catch (Exception | Error e) {
            System.out.printf("callback failure, e: %s%n", e.getMessage());
        } finally {
            this.running = false;
            monitorMap.remove(taskId);
        }
    }

    /**
     * Request a stop; takes effect the next time the task checks isRunning()
     */
    public void stop() {
        this.running = false;
    }

    /**
     * The actual work, implemented by subclasses
     */
    protected abstract void execute() throws Exception;

    protected boolean isRunning() {
        return running;
    }
}
  • Concrete task classes check the running flag before each loop iteration or long-running step, which is what makes them interruptible.
java
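import java.util.Map;
import java.util.concurrent.TimeUnit;

public class DemoTask extends AbstractStoppableRunnable {
    public DemoTask(String taskId, Map<String, AbstractStoppableRunnable> monitorMap) {
        super(taskId, monitorMap);
    }

    @Override
    protected void execute() throws InterruptedException {
        for (int i = 0; i < 100; i++) {
            // Check the stop flag before each iteration or long-running step
            if (!isRunning()) {
                System.out.println(Thread.currentThread().getName() + "=========> stopped");
                return;
            }

            // biz code
            System.out.println(Thread.currentThread().getName() + "=========> running, i= " + i);
            // Simulated work (assumed about 1s per step, consistent with the sample output below)
            TimeUnit.SECONDS.sleep(1);
        }
    }
}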

Test:

java
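import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class TheadMain {

    public static void main(String[] args) throws InterruptedException {
        Map<String, AbstractStoppableRunnable> monitorMap = new ConcurrentHashMap<>(1);
        /**
         * Log inside tasks that run on a thread pool: with submit(), an exception is held in the
         * returned Future and stays silent unless get() is called
         */
        ThreadPoolExecutor executor = new ThreadPoolExecutor(5, 10, 100, TimeUnit.SECONDS, new LinkedBlockingQueue<>(100));
        /**
         *  Never use either of the following forms: the submitted lambda only constructs the
         *  StoppableRunnable object and discards it, so the task itself never runs
         *  executor.execute(()->new StoppableRunnable());
         *  executor.execute(StoppableRunnable::new);
         */
        executor.execute(new DemoTask("20230618", monitorMap));

        // Stop after running for 5 seconds
        TimeUnit.SECONDS.sleep(5);
        // Single lookup, so the task cannot vanish between a containsKey check and the get
        AbstractStoppableRunnable task = monitorMap.get("20230618");
        if (task != null) {
            task.stop();
        }
        // Let the pool threads exit so the JVM can terminate
        executor.shutdown();
    }
}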

-------------------------
pool-1-thread-1=========> running, i= 0
pool-1-thread-1=========> running, i= 1
pool-1-thread-1=========> running, i= 2
pool-1-thread-1=========> running, i= 3
pool-1-thread-1=========> running, i= 4
pool-1-thread-1=========> stopped
manual stop, callback failure

Interruptible asynchronous task dispatcher

  • Sample code follows. It uses the thread's interrupt flag to start and stop the dispatch thread.
java
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class TaskDispatcher {
    private Thread dispatchThread;

    public void startDispatch() {
        this.dispatchThread = new Thread(new DispatchRunnable());
        this.dispatchThread.start();
    }

    public void stopDispatch() {
        // Guard against stop being called before start
        if (this.dispatchThread != null) {
            this.dispatchThread.interrupt();
        }
    }

    /**
     * Dispatch loop
     */
    public static class DispatchRunnable implements Runnable {
        private static final SecureRandom RANDOM = new SecureRandom();

        @Override
        public void run() {
            while (!Thread.currentThread().isInterrupted()) {
                List<Integer> tasks = mockListCreatedOrExecutingTasks();
                for (Integer task : tasks) {
                    if (Thread.currentThread().isInterrupted()) {
                        System.out.println(Thread.currentThread().getName() + "========> stopped");
                        return;
                    }

                    // Detailed dispatch logic goes here
                    System.out.println(Thread.currentThread().getName() + "========> do dispatch, task: " + task);

                    try {
                        TimeUnit.SECONDS.sleep(1);
                    } catch (InterruptedException e) {
                        // sleep clears the interrupt flag when it throws, so exit explicitly
                        System.out.println(Thread.currentThread().getName() + "========> stopped while sleep");
                        return;
                    }
                }
            }
        }

        public List<Integer> mockListCreatedOrExecutingTasks() {
            // Draw the task count once, outside the loop condition
            int count = RANDOM.nextInt(5);
            List<Integer> list = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                list.add(i);
            }
            return list;
        }
    }
}
java
import java.util.concurrent.TimeUnit;

public class MainTaskDispatcher {
    public static void main(String[] args) throws InterruptedException {
        TaskDispatcher taskDispatcher = new TaskDispatcher();
        taskDispatcher.startDispatch();

        TimeUnit.SECONDS.sleep(2);
        taskDispatcher.stopDispatch();

        System.out.println("===============restart===============");
        taskDispatcher.startDispatch();

        TimeUnit.SECONDS.sleep(3);
        taskDispatcher.stopDispatch();
    }
}

---------------------------
Thread-0========> do dispatch, task: 0
Thread-0========> do dispatch, task: 0
===============restart===============
Thread-0========> stopped while sleep
Thread-1========> do dispatch, task: 0
Thread-1========> do dispatch, task: 0
Thread-1========> do dispatch, task: 0
Thread-1========> stopped while sleep

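Summary

The above presents the solution that pa arrived at after continual refinement while exploring architectures for intelligent Q&A index builds. The core complexities, and how each is addressed, are:

  1. SM indexes take a lot of memory, and serving multiple products under a SaaS architecture multiplies the cost. Product routing was introduced so that each instance loads the indexes of only some products.
  2. SM indexes can only be rebuilt in full, which takes a long time. Builds therefore run offline on a schedule.
  3. Index builds touch many services and middleware components and fail easily, and retrying the whole build is slow and costly. Builds are split into independent, persisted tasks with stronger monitoring, plus support for task interruption, automatic retry, and manual partial retry, which makes builds more flexible.
  4. With a large corpus, index builds put heavy concurrent load on upstream and downstream services. Build task concurrency is controlled, and each machine executes tasks serially.
  5. Data across the different index types must stay consistent. An index version id (taskId) was introduced: indexes are switched, and old index resources unloaded, only after all indexes have been built, which improves availability and stability.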