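FAQ index-building architecture design
This post summarizes the index-building architecture I worked with during my time at pa.
FAQ index-building pipeline
Intelligent question answering (FAQ) is a dialogue system built on an AI technology stack and is a core supporting module of intelligent customer service (the other modules include task-oriented bots, search, recommendation, and input suggestion). Its core flow resembles that of most retrieval services: recall followed by ranking. In the recall stage, FAQ relies on several kinds of indexes, such as ES indexes, semantic indexes (SM_NSG, SM_ANNOY), and preprocessed indexes.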
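To make the recall-then-rank flow concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `Recaller` interface, the `Candidate` type, and the plain score sort that stands in for the real ranking stage are hypothetical and not taken from the FAQ codebase.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RecallRankSketch {

    /** A recalled FAQ candidate; the fields are illustrative. */
    static final class Candidate {
        final String faqId;
        final double score;

        Candidate(String faqId, double score) {
            this.faqId = faqId;
            this.score = score;
        }
    }

    /** Hypothetical abstraction over one index type (ES, SM, preprocessed). */
    interface Recaller {
        List<Candidate> recall(String query);
    }

    /** Recalls from every index, keeps the best score per FAQ, then sorts by score. */
    static List<String> search(String query, List<Recaller> recallers, int topK) {
        Map<String, Double> best = new LinkedHashMap<>();
        for (Recaller recaller : recallers) {
            for (Candidate c : recaller.recall(query)) {
                // the same FAQ may be recalled by several indexes; keep its highest score
                best.merge(c.faqId, c.score, Double::max);
            }
        }
        List<Map.Entry<String, Double>> ranked = new ArrayList<>(best.entrySet());
        // a plain descending sort stands in for the real ranking model (assumption)
        ranked.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(topK, ranked.size()); i++) {
            result.add(ranked.get(i).getKey());
        }
        return result;
    }
}
```

Each index type would plug in as one `Recaller`, so adding a new kind of index would not change the merge and ranking code.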
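- How indexes are built and served
- Index-building steps
- Online vs. offline
  - ES indexes support online incremental updates
  - SM indexes can only be rebuilt in full, and the full data set must be loaded into memory, so they are rebuilt offline (usually on a late-night schedule) to avoid affecting online traffic. Even with a polished index-switching mechanism, building online would not work: the build itself (vectorization, for example) puts heavy load on upstream and downstream services, and the whole pipeline would need traffic isolation for index-building requests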
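In summary, the FAQ index-building pipeline has the following characteristics:
- It involves many related services and middleware, so a build can easily fail
- It must build several kinds of indexes (ES and SM), and can only do so offline
The architecture therefore has to meet these requirements:
- Thorough monitoring and handling of build failures and exceptions, with support for flexible build strategies
- Data consistency: a case where, say, the ES build succeeds but the SM build fails and recall breaks must not occur
- Shorter build time: because SM can only be rebuilt in full, a large corpus (more than 1,000,000 entries) makes the build take too long and puts heavy concurrent load on upstream and downstream services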
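Product-routed SM index service
As described above, an SM index must be loaded into memory before it can serve queries (note: a vector database is another, better option). As a supporting module of the dialogue system, FAQ usually serves customer-service bot products across many scenarios. Under a conventional SaaS design, every instance would have to load the SM indexes of all products, which consumes an enormous amount of memory and can neither scale nor be maintained. Hence the product-routed SM index service shown in the figure below.
- The scheduling center maintains a mapping between products and machine instances. Each index-service instance loads the in-memory indexes of only some products, which lowers memory consumption. In the figure above, index-service has 3 instances, and the first one only needs to load products 1, 2, 3, and 4
- During an index build, the scheduling center assigns each in-memory index to a specific instance based on the memory status of the index-service instances (after a restart, an instance reloads its indexes once the scheduling center notifies it). When loading completes, the instance reports back to the scheduling center, where the loading status of every product's resources can be monitored easily. A resource scale-out and scale-in mechanism is also provided to improve availability
- The gateway periodically syncs the product-to-instance mapping table from the scheduling center. When a request carrying an appId passes through the gateway, the gateway finds the matching instances from the subscription relationship and forwards the request according to the load-balancing policy
- Note: with this architecture, more service modules can later be defined as resources, so that product routing can be customized and instance resources controlled flexibly, for example by allocating more instances to higher-traffic products under the SaaS architecture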
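The routing idea in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the production code: `SchedulingCenter`, `Gateway`, the "most free memory" placement rule, and round-robin load balancing are all hypothetical choices, and the sketch ignores persistence and thread safety on the scheduling side.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class ProductRoutingSketch {

    /** Scheduling-center side: tracks free memory and the product-to-instance table. */
    static final class SchedulingCenter {
        private final Map<String, Long> freeMemoryMb = new LinkedHashMap<>();
        private final Map<String, List<String>> productToInstances = new HashMap<>();

        void registerInstance(String instance, long freeMb) {
            freeMemoryMb.put(instance, freeMb);
        }

        /** Assigns one copy of a product's index to the fitting instance with the most free memory. */
        String assign(String appId, long indexSizeMb) {
            List<String> holders = productToInstances.getOrDefault(appId, Collections.emptyList());
            String target = null;
            for (Map.Entry<String, Long> e : freeMemoryMb.entrySet()) {
                // skip instances that are too small or already hold this product
                boolean eligible = e.getValue() >= indexSizeMb && !holders.contains(e.getKey());
                if (eligible && (target == null || e.getValue() > freeMemoryMb.get(target))) {
                    target = e.getKey();
                }
            }
            if (target == null) {
                throw new IllegalStateException("no instance can load the index of " + appId);
            }
            freeMemoryMb.put(target, freeMemoryMb.get(target) - indexSizeMb);
            productToInstances.computeIfAbsent(appId, k -> new ArrayList<>()).add(target);
            return target;
        }

        /** Copy of the mapping table, as a gateway would pull it periodically. */
        Map<String, List<String>> snapshot() {
            Map<String, List<String>> copy = new HashMap<>();
            productToInstances.forEach((k, v) -> copy.put(k, new ArrayList<>(v)));
            return copy;
        }
    }

    /** Gateway side: routes by appId, round-robin across the instances holding that product. */
    static final class Gateway {
        private volatile Map<String, List<String>> table = new HashMap<>();
        private final Map<String, AtomicInteger> counters = new ConcurrentHashMap<>();

        void sync(Map<String, List<String>> snapshot) {
            this.table = snapshot;
        }

        String route(String appId) {
            List<String> instances = table.get(appId);
            if (instances == null || instances.isEmpty()) {
                throw new IllegalArgumentException("no instance serves appId " + appId);
            }
            int n = counters.computeIfAbsent(appId, k -> new AtomicInteger()).getAndIncrement();
            return instances.get(Math.floorMod(n, instances.size()));
        }
    }
}
```

Calling `assign` a second time for the same appId places a replica on a different instance, which is one simple way to picture the scale-out mechanism; giving a high-traffic product more instances is just more `assign` calls.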
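In summary, the product-routed SM index service essentially exists to solve the high memory footprint of SM indexes, and the resource scheduling, monitoring, and scaling system grew out of that. Its drawbacks are also fairly obvious: it depends on fixed IPs, is tightly coupled, is hard to migrate, and is not easy to maintain.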
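FAQ index-building architecture design
End-to-end index-building flow
As shown in the flowchart above:
- After the knowledge base publishes a corpus (the knowledge base and FAQ are separate systems, so corpus data must be synced through an API), faq-builder first buffers the notification. Only when the scheduled job triggers an index build does it merge the notifications and sync the corpus. Once the sync completes, it generates the build tasks for the corresponding indexes, persists them to the database, and starts task dispatching
- When the task dispatcher starts, it queries the database for all tasks in the created or executing state and distributes them to idle faq-builder instances, which execute them asynchronously and in parallel. The execution details are shown in the flowchart on the right above
- When all tasks have finished, the engine is notified to switch indexes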
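The "buffer first, merge on trigger" step can be sketched as below. The class and method names (`BuildTriggerSketch`, `onCorpusPublished`, `onTimerFired`) are hypothetical, and the sketch assumes that one taskId is shared by all tasks generated in the same build round; corpus syncing and persistence are left as comments.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class BuildTriggerSketch {

    /** Index types built for each product; the names are illustrative. */
    enum IndexType { ES, SM }

    /** One build task: a product, an index type, and the version id of the build round. */
    static final class BuildTask {
        final String taskId;
        final String appId;
        final IndexType type;

        BuildTask(String taskId, String appId, IndexType type) {
            this.taskId = taskId;
            this.appId = appId;
            this.type = type;
        }
    }

    /** Products whose corpus was published since the last build; a set merges duplicates. */
    private final Set<String> pendingAppIds = new LinkedHashSet<>();

    /** Called when the knowledge base publishes a corpus: only record the notification. */
    synchronized void onCorpusPublished(String appId) {
        pendingAppIds.add(appId);
    }

    /** Called by the scheduled job: merge buffered notifications into build tasks. */
    synchronized List<BuildTask> onTimerFired(String taskId) {
        List<BuildTask> tasks = new ArrayList<>();
        for (String appId : pendingAppIds) {
            // a real builder would sync the corpus of appId here before creating tasks
            for (IndexType type : IndexType.values()) {
                tasks.add(new BuildTask(taskId, appId, type));
            }
        }
        pendingAppIds.clear();
        // the caller would persist these tasks and then start the dispatcher
        return tasks;
    }
}
```

Because notifications are only recorded until the timer fires, a product that publishes ten times in a day still produces a single build per index type.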
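The benefits of this design are:
- Build tasks are persisted, so every build and every build path can be queried and monitored
- Build concurrency is controlled, which keeps a large batch of corpora processed at once from overwhelming upstream and downstream services; asynchronous parallel execution across multiple instances also improves build performance
- Expiry handling for build tasks, automatic retry on timeout, and scheduled cleanup of the resources left by failed tasks are all in place (not shown in the figure)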
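As a rough picture of the timeout handling, a periodic sweep over executing tasks could look like the sketch below. The 30-minute timeout, the limit of 3 attempts, and the `TaskRecord` fields are assumptions chosen for illustration; the source does not state the real values.

```java
import java.util.ArrayList;
import java.util.List;

class TimeoutRetrySketch {

    enum Status { EXECUTING, FAILED }

    /** In-memory stand-in for a persisted task row; the fields are illustrative. */
    static final class TaskRecord {
        final String id;
        Status status = Status.EXECUTING;
        long startedAtMillis;
        int attempts = 1;

        TaskRecord(String id, long startedAtMillis) {
            this.id = id;
            this.startedAtMillis = startedAtMillis;
        }
    }

    static final long TIMEOUT_MILLIS = 30 * 60 * 1000L; // assumed value
    static final int MAX_ATTEMPTS = 3;                  // assumed value

    /** Re-queues timed-out tasks, or marks them failed once the attempts are used up. */
    static List<TaskRecord> sweep(List<TaskRecord> tasks, long nowMillis) {
        List<TaskRecord> toRetry = new ArrayList<>();
        for (TaskRecord task : tasks) {
            if (task.status != Status.EXECUTING) {
                continue;
            }
            if (nowMillis - task.startedAtMillis < TIMEOUT_MILLIS) {
                continue; // still within its time budget
            }
            if (task.attempts >= MAX_ATTEMPTS) {
                // resources left by failed tasks would be cleaned up by a separate job
                task.status = Status.FAILED;
            } else {
                task.attempts++;
                task.startedAtMillis = nowMillis;
                toRetry.add(task);
            }
        }
        return toRetry;
    }
}
```

Passing the current time in as a parameter keeps the sweep deterministic and easy to test.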
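Index-build task state transitions
As shown in the figure below, a build task has three final states: abandoned, succeeded, and failed. A manual retry is allowed from every task state.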
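The state model can be written down as a small enum. The three final states and "manual retry from any state" come from the text; the exact set of allowed transitions below is an assumption, since it is defined by the figure rather than by the prose, and the names are hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

class TaskStateSketch {

    enum TaskStatus {
        CREATED, EXECUTING, SUCCESS, FAILED, ABANDONED;

        /** Final states named in the text: abandoned, succeeded, failed. */
        boolean isTerminal() {
            return this == SUCCESS || this == FAILED || this == ABANDONED;
        }

        /** Assumed automatic transitions; terminal states have none. */
        Set<TaskStatus> next() {
            switch (this) {
                case CREATED:
                    return EnumSet.of(EXECUTING, ABANDONED);
                case EXECUTING:
                    return EnumSet.of(SUCCESS, FAILED, ABANDONED);
                default:
                    return EnumSet.noneOf(TaskStatus.class);
            }
        }
    }

    /** Applies an automatic transition, rejecting anything outside the table. */
    static TaskStatus transition(TaskStatus from, TaskStatus to) {
        if (!from.next().contains(to)) {
            throw new IllegalStateException("illegal transition " + from + " -> " + to);
        }
        return to;
    }

    /** Manual retry is allowed from every state and puts the task back in the queue. */
    static TaskStatus manualRetry(TaskStatus from) {
        return TaskStatus.CREATED;
    }
}
```

Keeping manual retry outside the transition table makes it explicit that it is an operator action rather than part of the normal flow.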
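The benefits of this design are:
- A build can be aborted and partially retried by hand, a timeout triggers an automatic retry by default, and the builder resumes dispatching tasks after a restart (not shown in the figure), which greatly improves the flexibility and stability of index building
- Indexes are versioned and switched by taskId (similar in concept to a releaseId), which ensures data consistency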
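The version switch can be pictured as an all-or-nothing swap of the active taskId. The sketch below is an assumption about how such a switch might look, not the engine's actual code: `IndexVersionSwitchSketch`, `trySwitch`, and the result map are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

class IndexVersionSwitchSketch {

    /** Index types that must all be ready before a switch; the names are illustrative. */
    enum IndexType { ES, SM }

    /** taskId of the version currently serving queries; null until the first switch. */
    private final AtomicReference<String> activeTaskId = new AtomicReference<>();

    /** Switches to taskId only if every index type was built successfully for it. */
    boolean trySwitch(String taskId, Map<IndexType, Boolean> buildResults) {
        for (IndexType type : IndexType.values()) {
            if (!Boolean.TRUE.equals(buildResults.get(type))) {
                return false; // keep serving the old version
            }
        }
        activeTaskId.set(taskId);
        // a real engine would unload the resources of the previous version here
        return true;
    }

    String current() {
        return activeTaskId.get();
    }
}
```

With this shape, a failed SM build leaves the previous ES and SM versions in service together, which is the consistency property the requirement asks for.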
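Interruptible index-build thread
Sample code follows:
- A running flag marks that the program is executing. When the thread executes the run method, it first registers itself in the monitoring map under its taskId
- The top-level abstract class AbstractStoppableRunnable exposes a stop method so that callers can change the running flag to try to abort the program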
java
import java.util.Map;

public abstract class AbstractStoppableRunnable implements Runnable {
    // volatile so that a stop() issued from another thread is visible to the worker
    private volatile boolean running;
    private final Map<String, AbstractStoppableRunnable> monitorMap;
    private final String taskId;

    public AbstractStoppableRunnable(String taskId, Map<String, AbstractStoppableRunnable> monitorMap) {
        this.taskId = taskId;
        this.monitorMap = monitorMap;
    }

    @Override
    public void run() {
        try {
            this.running = true;
            // register under taskId so that callers can look the task up and stop it
            monitorMap.put(taskId, this);
            execute();
            if (isRunning()) {
                System.out.println("callback success");
            } else {
                System.out.println("manual stop, callback failure");
            }
        } catch (Exception | Error e) {
            System.out.printf("callback failure, e: %s%n", e.getMessage());
        } finally {
            this.running = false;
            monitorMap.remove(taskId);
        }
    }

    /**
     * Request a stop; the task exits at its next check of the running flag
     */
    public void stop() {
        this.running = false;
    }

    /**
     * The actual work, implemented by subclasses
     */
    protected abstract void execute() throws Exception;

    protected boolean isRunning() {
        return running;
    }
}
- Concrete task classes make themselves interruptible by checking the running flag before each loop iteration or long-running step
java
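import java.util.Map;
import java.util.concurrent.TimeUnit;

public class DemoTask extends AbstractStoppableRunnable {
    public DemoTask(String taskId, Map<String, AbstractStoppableRunnable> monitorMap) {
        super(taskId, monitorMap);
    }

    @Override
    protected void execute() throws InterruptedException {
        for (int i = 0; i < 100; i++) {
            // check the stop flag before each iteration or long-running step
            if (!isRunning()) {
                System.out.println(Thread.currentThread().getName() + "=========> stopped");
                return;
            }
            // biz code
            System.out.println(Thread.currentThread().getName() + "=========> running, i= " + i);
            // assumption: a 1-second sleep stands in for real work, so the task is still running when stop() is called
            TimeUnit.SECONDS.sleep(1);
        }
    }
}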
A test follows:
java
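import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class TheadMain {
    public static void main(String[] args) throws InterruptedException {
        Map<String, AbstractStoppableRunnable> monitorMap = new ConcurrentHashMap<>(1);
        /**
         * When using a thread pool, log exceptions inside the task itself:
         * a task passed to submit() keeps its exception in the Future, so nothing is reported unless get() is called
         */
        ThreadPoolExecutor executor = new ThreadPoolExecutor(5, 10, 100, TimeUnit.SECONDS, new LinkedBlockingQueue<>(100));
        /**
         * Never use either of the two forms below: the pool thread merely constructs
         * the StoppableRunnable object and discards it, and run() is never called
         * executor.execute(()->new StoppableRunnable());
         * executor.execute(StoppableRunnable::new);
         */
        executor.execute(new DemoTask("20230618", monitorMap));
        // stop after running for 5 seconds
        TimeUnit.SECONDS.sleep(5);
        // a single get() avoids a race between containsKey() and get() if the task finishes in between
        AbstractStoppableRunnable task = monitorMap.get("20230618");
        if (task != null) {
            task.stop();
        }
        // let the pool threads exit so that the JVM can terminate
        executor.shutdown();
    }
}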
-------------------------
pool-1-thread-1=========> running, i= 0
pool-1-thread-1=========> running, i= 1
pool-1-thread-1=========> running, i= 2
pool-1-thread-1=========> running, i= 3
pool-1-thread-1=========> running, i= 4
pool-1-thread-1=========> stopped
manual stop, callback failure
Interruptible asynchronous task dispatcher
- Sample code follows; it uses the thread's interrupt flag to start and stop the dispatch thread
java
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class TaskDispatcher {
    private Thread dispatchThread;

    public void startDispatch() {
        this.dispatchThread = new Thread(new DispatchRunnable());
        this.dispatchThread.start();
    }

    public void stopDispatch() {
        // guard against a stop before the first start
        if (this.dispatchThread != null) {
            this.dispatchThread.interrupt();
        }
    }

    /**
     * Dispatches tasks
     */
    public static class DispatchRunnable implements Runnable {
        private final SecureRandom random = new SecureRandom();

        @Override
        public void run() {
            while (!Thread.currentThread().isInterrupted()) {
                List<Integer> tasks = mockListCreatedOrExecutingTasks();
                for (Integer task : tasks) {
                    if (Thread.currentThread().isInterrupted()) {
                        System.out.println(Thread.currentThread().getName() + "========> stopped");
                        return;
                    }
                    // the detailed dispatch logic goes here
                    System.out.println(Thread.currentThread().getName() + "========> do dispatch, task: " + task);
                    try {
                        TimeUnit.SECONDS.sleep(1);
                    } catch (InterruptedException e) {
                        // sleep() clears the interrupt flag when it throws, so exit explicitly
                        System.out.println(Thread.currentThread().getName() + "========> stopped while sleep");
                        return;
                    }
                }
            }
        }

        public List<Integer> mockListCreatedOrExecutingTasks() {
            ArrayList<Integer> list = new ArrayList<>();
            // draw the size once instead of re-drawing the loop bound on every iteration
            int size = random.nextInt(5);
            for (int i = 0; i < size; i++) {
                list.add(i);
            }
            return list;
        }
    }
}
java
import java.util.concurrent.TimeUnit;

public class MainTaskDispatcher {
    public static void main(String[] args) throws InterruptedException {
        TaskDispatcher taskDispatcher = new TaskDispatcher();
        // first run: dispatch for 2 seconds, then interrupt
        taskDispatcher.startDispatch();
        TimeUnit.SECONDS.sleep(2);
        taskDispatcher.stopDispatch();
        System.out.println("===============restart===============");
        // second run: a new dispatch thread is created on restart
        taskDispatcher.startDispatch();
        TimeUnit.SECONDS.sleep(3);
        taskDispatcher.stopDispatch();
    }
}
---------------------------
Thread-0========> do dispatch, task: 0
Thread-0========> do dispatch, task: 0
===============restart===============
Thread-0========> stopped while sleep
Thread-1========> do dispatch, task: 0
Thread-1========> do dispatch, task: 0
Thread-1========> do dispatch, task: 0
Thread-1========> stopped while sleep
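Summary
The above shows the solution pa arrived at, after continual refinement, while exploring an architecture for intelligent question-answering index building. The core complexities, and how each was handled, are:
- SM indexes have a high memory footprint, and serving many products under a SaaS architecture consumes a lot of memory; product routing was introduced so that one instance loads the indexes of only some products
- SM indexes can only be rebuilt in full, which takes a long time; builds run offline on a schedule
- Index building involves many services and middleware and fails easily, and retrying the whole build is slow and costly; the build is split into independent, persisted tasks with better monitoring, plus task interruption, automatic retry, and partial manual retry, which makes builds more flexible
- With a large corpus, index building puts heavy concurrent load on upstream and downstream services; the concurrency of build tasks is limited, with each machine executing tasks serially
- Data across the different index types must stay consistent; an index version id (taskId) was introduced, and the index switch and the unloading of old index resources happen only after all indexes are built, which improves availability and stability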