Block Nested Loop Join

引言

在数据库中，join 是经常使用的操作，但其工作量也可能是很大的。那么如何高效地完成 join 操作呢？

这篇 blog 将介绍：

常见的 join 实现方式
如何在 Postgresql 源代码中实现 block nested loop join

常见 join 算法

Simple Nested-Loop Join

简单来说，Simple Nested-Loop Join 就是一个双层 for 循环，通过循环外层表的行数据，逐个与内层表的所有行数据进行比较来获取结果。

用伪代码描述大致如下：

for (each tuple i in outer relation) {
	for (each tuple j in inner relation) {
        if (join_condition(tuple i, tuple j) is true)
      		emit (tuple i, tuple j)
       	else
      		continue
  }
}

特点：

Nested-Loop Join 简单粗暴容易理解，就是通过双层循环比较数据来获得结果，但是这种算法显然太过于粗鲁，如果每个表有1万条数据，那么对数据比较的次数=1万 * 1万 = 1亿次，很显然这种查询效率会非常慢。

当然，数据库肯定不会这么粗暴地 join，所以就出现了后面的两种对Nested-Loop Join 优化算法。在执行 join 查询时，数据库会根据情况选择后面的两种 join 优化算法的进行join查询。

Index Nested-Loop Join

Index Nested-Loop Join 的优化思路主要是：减少内层表数据的匹配次数。

简单来说，Index Nested-Loop Join 就是通过外层表匹配条件，直接与内层表索引进行匹配，避免和内层表的每条记录去进行比较，这样极大的减少了对内层表的匹配次数，从原来的 “匹配次数=外层表行数 * 内层表行数”，变成了 “外层表的行数 * 内层表索引的高度”，极大的提升了 join 的性能。

但是，Index Nested-Loop Join 算法的前提是：匹配的字段必须建立了索引。

Block Nested-Loop join

Block Nested-Loop Join 其优化思路是：减少内层表的扫表次数。

Block Nested-Loop Join 算法意在通过一次性缓存外层表的多条数据，以此来减少内层表的扫表次数，从而达到提升性能的目的。

如果无法使用 Index Nested-Loop Join 时候，一般都会使用 Block Nested-Loop Join 算法。

PostgreSQL 实验

准备工作

下载 postgresql 源代码：

https://www.postgresql.org/ftp/source/v12.0/

在本地编译并安装 postgresql：

# 切换到源代码目录
cd postgresql-12.0

# 初始化 make 配置
/configure --enable-depend --enable-cassert --enable-debug CFLAGS="-O0" --prefix=$HOME/pgsql

# 编译
make

# 安装(一般会安装到$HOME/pgsql)
make install

初始化本地 postgresql 服务器：

若你不幸把 pgsql 装到别的位置，把 $HOME/pgsql 替换成你安装的路径。
1
$HOME/pgsql/bin/initdb -D $HOME/pgsql/data --locale=C

启动 postgresql 服务器：

1	/home/[guanz]/pgsql/bin/pg_ctl -D /home/[guanz]/pgsql/data -l logfile start

连接 postgresql 服务器：
1
$HOME/pgsql/bin/psql postgres

关闭 postgresql 服务器：

“Ctrl + D” 和 \q 均可以退出 postgresql 命令行。

1 2	# This command depends on your installation location, please modify it /home/[guanz]/pgsql/bin/pg_ctl -D /home/[guanz]/pgsql/data stop

加载测试数据：

1
2
3

postgres=# CREATE DATABASE similarity;
postgres-# \c similarity
postgres-# \i /home/[guanz/data/]similarity_data.sql

select_example

源代码解读

虽然源代码很多，但我们只关注以下几个文件：

src/backend/executor/nodeNestloop.c
src/include/executor/nodeNestloop.h
src/include/nodes/execnodes.h

Nested-Loop 主要算法都在 nodeNestloop.c 中实现：

其中 ExecNestLoop() 是执行嵌套循环连接，其余两个 ExecInitNestLoop() 和 ExecEndNestLoop() 则分别完成初始化和终止。

/*
* xht03：
* ExecNestLoop 是迭代器函数，不是一次执行完所有循环，
* 而是每次循环时执行一次的函数
*/
static TupleTableSlot *
ExecNestLoop(PlanState *pstate)
{
    /*
    * xht03：
    * node 指的是查询计划树中的一个节点。
    * 在这段代码中，NestLoopState *node 是
    * 一个指向 NestLoopState 结构体的指针，
    * 这个结构体通常包含了执行Nested Loop Join 所需要的状态信息。
    */
    NestLoopState *node = castNode(NestLoopState, pstate);
	NestLoop   *nl;
    
    /*
    * xht03：
    * Plan 指的是查询计划，
    * 它是数据库查询优化器生成的一种数据结构，
    * 用于描述如何执行一个SQL查询。
    * 查询计划是由一系列的操作步骤（或称为操作符）组成的树状结构，
    * 每一个操作步骤都对应查询计划树中的一个节点。
    */
	PlanState  *innerPlan;
	PlanState  *outerPlan;
    
    /*
    * xht03：
    * slot 意为“槽”
    * outerTupleSlot和innerTupleSlot是用来：
    * 存储从外部/内部关系（或称为左/右表）获取的当前元组
    */  
	TupleTableSlot *outerTupleSlot;
	TupleTableSlot *innerTupleSlot;
    
    // 连接条件
    // 其他条件（比如：where后面跟的表达式）
	ExprState  *joinqual;
	ExprState  *otherqual;
    
    // 存储执行过程中的上下文
	ExprContext *econtext;
	ListCell   *lc;
	
    // 检查中断（不用管）
	CHECK_FOR_INTERRUPTS();

	/*
	 * get information from the node
	 */
	ENL1_printf("getting info from node");

	nl = (NestLoop *) node->js.ps.plan;
	joinqual = node->js.joinqual;
	otherqual = node->js.ps.qual;
	outerPlan = outerPlanState(node);	// 记录外表所在的tuple
	innerPlan = innerPlanState(node);	// 记录内表所在的tuple
	econtext = node->js.ps.ps_ExprContext;

	/*
	 * Reset per-tuple memory context to free any expression evaluation
	 * storage allocated in the previous tuple cycle.
	 */
	ResetExprContext(econtext);

	/*
	 * Ok, everything is setup for the join so now loop until we return a
	 * qualifying join tuple.
	 */
	ENL1_printf("entering main loop");

	for (;;)
	{
		/*
		 * xht03:
		 * 当需要新的外部元组时，获取下一个的外部元组，
		 * 并让内表从头开始扫描（重置）
		 */
		if (node->nl_NeedNewOuter)
		{
			ENL1_printf("getting new outer tuple");
			outerTupleSlot = ExecProcNode(outerPlan);

			/*
			 * xht03：
			 * 如果下一个外部元组为空，则 join 完成了
			 */
			if (TupIsNull(outerTupleSlot))
			{
				ENL1_printf("no outer tuple, ending join");
				return NULL;
			}

			ENL1_printf("saving new outer tuple information");
			econtext->ecxt_outertuple = outerTupleSlot;
			node->nl_NeedNewOuter = false;
			node->nl_MatchedOuter = false;

			/*
			 * fetch the values of any outer Vars that must be passed to the
			 * inner scan, and store them in the appropriate PARAM_EXEC slots.
			 *
			 * xht03：
			 * 传递一些参数，不用管，也不要改
			 */
			foreach(lc, nl->nestParams)
			{
				NestLoopParam *nlp = (NestLoopParam *) lfirst(lc);
				int			paramno = nlp->paramno;
				ParamExecData *prm;

				prm = &(econtext->ecxt_param_exec_vals[paramno]);
				/* Param value should be an OUTER_VAR var */
				Assert(IsA(nlp->paramval, Var));
				Assert(nlp->paramval->varno == OUTER_VAR);
				Assert(nlp->paramval->varattno > 0);
				prm->value = slot_getattr(outerTupleSlot,
										  nlp->paramval->varattno,
										  &(prm->isnull));
				/* Flag parameter value as changed */
				innerPlan->chgParam = bms_add_member(innerPlan->chgParam,
													 paramno);
			}

			/*
			 * now rescan the inner plan
			 */
			ENL1_printf("rescanning inner plan");
			ExecReScan(innerPlan);
		}

		/*
		 * we have an outerTuple, try to get the next inner tuple.
		 */
		ENL1_printf("getting new inner tuple");

		innerTupleSlot = ExecProcNode(innerPlan);
		econtext->ecxt_innertuple = innerTupleSlot;

		if (TupIsNull(innerTupleSlot))
		{
			ENL1_printf("no inner tuple, need new outer tuple");

			node->nl_NeedNewOuter = true;

			if (!node->nl_MatchedOuter &&
				(node->js.jointype == JOIN_LEFT ||
				 node->js.jointype == JOIN_ANTI))
			{
				/*
				 * We are doing an outer join and there were no join matches
				 * for this outer tuple.  Generate a fake join tuple with
				 * nulls for the inner tuple, and return it if it passes the
				 * non-join quals.
				 * 
				 * xht03：
				 * 你可能也发现了：外部元组和内部元组的值，除了会存进 outerTupleSlot
				 * 和 innerTupleSlot，还要存进上下文 econtext 里。
				 * 这是因为：ExecQual() 判断是否满足等式条件的函数，是使用上下文中
				 * 存储的内外部元组信息进行判断的。
				 * 这样的函数还有很多，所以一定要记得更新上下文
				 */
				econtext->ecxt_innertuple = node->nl_NullInnerTupleSlot;

				ENL1_printf("testing qualification for outer-join tuple");
				
				if (otherqual == NULL || ExecQual(otherqual, econtext))
				{
					/*
					 * qualification was satisfied so we project and return
					 * the slot containing the result tuple using
					 * ExecProject().
					 */
					ENL1_printf("qualification succeeded, projecting tuple");
					return ExecProject(node->js.ps.ps_ProjInfo);
				}
				else
					InstrCountFiltered2(node, 1);
			}

			/*
			 * Otherwise just return to top of loop for a new outer tuple.
			 */
			continue;
		}

		/*
		 * at this point we have a new pair of inner and outer tuples so we
		 * test the inner and outer tuples to see if they satisfy the node's
		 * qualification.
		 *
		 * Only the joinquals determine MatchedOuter status, but all quals
		 * must pass to actually return the tuple.
		 */
		ENL1_printf("testing qualification");

		if (ExecQual(joinqual, econtext))
		{
			node->nl_MatchedOuter = true;

			/* In an antijoin, we never return a matched tuple */
			if (node->js.jointype == JOIN_ANTI)
			{
				node->nl_NeedNewOuter = true;
				continue;		/* return to top of loop */
			}

			/*
			 * If we only need to join to the first matching inner tuple, then
			 * consider returning this one, but after that continue with next
			 * outer tuple.
			 */
			if (node->js.single_match)
				node->nl_NeedNewOuter = true;

			if (otherqual == NULL || ExecQual(otherqual, econtext))
			{
				/*
				 * qualification was satisfied so we project and return the
				 * slot containing the result tuple using ExecProject().
				 */
				ENL1_printf("qualification succeeded, projecting tuple");

				return ExecProject(node->js.ps.ps_ProjInfo);
			}
			else
				InstrCountFiltered2(node, 1);
		}
		else
			InstrCountFiltered1(node, 1);

		/*
		 * Tuple fails qual, so free per-tuple memory and try again.
		 */
		ResetExprContext(econtext);

		ENL1_printf("qualification failed, looping");
	}
}

/* ----------------------------------------------------------------
 *		ExecInitNestLoop
 * ----------------------------------------------------------------
 */
NestLoopState *
ExecInitNestLoop(NestLoop *node, EState *estate, int eflags)
{
	NestLoopState *nlstate;

	/* check for unsupported flags */
	Assert(!(eflags & (EXEC_FLAG_BACKWARD | EXEC_FLAG_MARK)));

	NL1_printf("ExecInitNestLoop: %s\n",
			   "initializing node");

	/*
	 * create state structure
	 */
	nlstate = makeNode(NestLoopState);
	nlstate->js.ps.plan = (Plan *) node;
	nlstate->js.ps.state = estate;
	nlstate->js.ps.ExecProcNode = ExecNestLoop;

	/*
	 * Miscellaneous initialization
	 *
	 * create expression context for node
	 */
	ExecAssignExprContext(estate, &nlstate->js.ps);

	/*
	 * initialize child nodes
	 *
	 * If we have no parameters to pass into the inner rel from the outer,
	 * tell the inner child that cheap rescans would be good.  If we do have
	 * such parameters, then there is no point in REWIND support at all in the
	 * inner child, because it will always be rescanned with fresh parameter
	 * values.
	 */
	outerPlanState(nlstate) = ExecInitNode(outerPlan(node), estate, eflags);
	if (node->nestParams == NIL)
		eflags |= EXEC_FLAG_REWIND;
	else
		eflags &= ~EXEC_FLAG_REWIND;
	innerPlanState(nlstate) = ExecInitNode(innerPlan(node), estate, eflags);

	/*
	 * Initialize result slot, type and projection.
	 */
	ExecInitResultTupleSlotTL(&nlstate->js.ps, &TTSOpsVirtual);
	ExecAssignProjectionInfo(&nlstate->js.ps, NULL);

	/*
	 * initialize child expressions
	 */
	nlstate->js.ps.qual =
		ExecInitQual(node->join.plan.qual, (PlanState *) nlstate);
	nlstate->js.jointype = node->join.jointype;
	nlstate->js.joinqual =
		ExecInitQual(node->join.joinqual, (PlanState *) nlstate);

	/*
	 * detect whether we need only consider the first matching inner tuple
	 */
	nlstate->js.single_match = (node->join.inner_unique ||
								node->join.jointype == JOIN_SEMI);

	/* set up null tuples for outer joins, if needed */
	switch (node->join.jointype)
	{
		case JOIN_INNER:
		case JOIN_SEMI:
			break;
		case JOIN_LEFT:
		case JOIN_ANTI:
			nlstate->nl_NullInnerTupleSlot =
				ExecInitNullTupleSlot(estate,
									  ExecGetResultType(innerPlanState(nlstate)),
									  &TTSOpsVirtual);
			break;
		default:
			elog(ERROR, "unrecognized join type: %d",
				 (int) node->join.jointype);
	}

	/*
	 * finally, wipe the current outer tuple clean.
	 */
	nlstate->nl_NeedNewOuter = true;
	nlstate->nl_MatchedOuter = false;

	NL1_printf("ExecInitNestLoop: %s\n",
			   "node initialized");

	return nlstate;
}

/* ----------------------------------------------------------------
 *		ExecEndNestLoop
 *
 *		closes down scans and frees allocated storage
 * ----------------------------------------------------------------
 */
void
ExecEndNestLoop(NestLoopState *node)
{
	NL1_printf("ExecEndNestLoop: %s\n",
			   "ending node processing");

	/*
	 * Free the exprcontext
	 */
	ExecFreeExprContext(&node->js.ps);

	/*
	 * clean out the tuple table
	 */
	ExecClearTuple(node->js.ps.ps_ResultTupleSlot);

	/*
	 * close down subplans
	 */
	ExecEndNode(outerPlanState(node));
	ExecEndNode(innerPlanState(node));

	NL1_printf("ExecEndNestLoop: %s\n",
			   "node processing ended");
}

/* ----------------------------------------------------------------
 *		ExecReScanNestLoop
 * ----------------------------------------------------------------
 */
void
ExecReScanNestLoop(NestLoopState *node)
{
	PlanState  *outerPlan = outerPlanState(node);

	/*
	 * If outerPlan->chgParam is not null then plan will be automatically
	 * re-scanned by first ExecProcNode.
	 */
	if (outerPlan->chgParam == NULL)
		ExecReScan(outerPlan);

	/*
	 * innerPlan is re-scanned for each new outer tuple and MUST NOT be
	 * re-scanned from here or you'll get troubles from inner index scans when
	 * outer Vars are used as run-time keys...
	 */

	node->nl_NeedNewOuter = true;
	node->nl_MatchedOuter = false;
}

源代码修改

我们主要做了一下修改：

新增了结构体类型 NestedBlock ，用于存放中间缓存的外部元组。

与之相应地：

增加了 GetNestedBlock() 函数，用于获取新的中间元组。
在上下文类型 ExprContext 中增加 NestedBlock 变量，将中间缓存元组记录在上下文中。
修改 ExecNestLoop() 函数：外部元组一次取出至多 NESTED_BLOCK_SIZE 个元组放入缓存 NestedBlock 中，并照旧扫描内部元组。但每次需要将缓存中的所有 outer tuple 一一与 inner tuple 进行比较，且外部元组的指针需要往后移动 NESTED_BLOCK_SIZE 个元组。

修改细节如下（修改之处都加有 xht03 注释）：

execnodes.h

// xht03
#define NESTED_BLOCK_SIZE 64

typedef struct NestedBlock{
    TupleTableSlot *tuple[NESTED_BLOCK_SIZE];	// 存放外部元组
	bool isMatched[NESTED_BLOCK_SIZE];			// 是否在内部关系中匹配到过
    int size;									// 缓存中的元组个数
} NestedBlock;


typedef struct ExprContext
{
	NodeTag		type;

	/* Tuples that Var nodes in expression may refer to */
#define FIELDNO_EXPRCONTEXT_SCANTUPLE 1
	TupleTableSlot *ecxt_scantuple;
#define FIELDNO_EXPRCONTEXT_INNERTUPLE 2
	TupleTableSlot *ecxt_innertuple;
#define FIELDNO_EXPRCONTEXT_OUTERTUPLE 3
	TupleTableSlot *ecxt_outertuple;

// xht03
	NestedBlock *ecxt_block;

	/* Memory contexts for expression evaluation --- see notes above */
	MemoryContext ecxt_per_query_memory;
	MemoryContext ecxt_per_tuple_memory;

	/* Values to substitute for Param nodes in expression */
	ParamExecData *ecxt_param_exec_vals;	/* for PARAM_EXEC params */
	ParamListInfo ecxt_param_list_info; /* for other param types */

	/*
	 * Values to substitute for Aggref nodes in the expressions of an Agg
	 * node, or for WindowFunc nodes within a WindowAgg node.
	 */
#define FIELDNO_EXPRCONTEXT_AGGVALUES 8
	Datum	   *ecxt_aggvalues; /* precomputed values for aggs/windowfuncs */
#define FIELDNO_EXPRCONTEXT_AGGNULLS 9
	bool	   *ecxt_aggnulls;	/* null flags for aggs/windowfuncs */

	/* Value to substitute for CaseTestExpr nodes in expression */
#define FIELDNO_EXPRCONTEXT_CASEDATUM 10
	Datum		caseValue_datum;
#define FIELDNO_EXPRCONTEXT_CASENULL 11
	bool		caseValue_isNull;

	/* Value to substitute for CoerceToDomainValue nodes in expression */
#define FIELDNO_EXPRCONTEXT_DOMAINDATUM 12
	Datum		domainValue_datum;
#define FIELDNO_EXPRCONTEXT_DOMAINNULL 13
	bool		domainValue_isNull;

	/* Link to containing EState (NULL if a standalone ExprContext) */
	struct EState *ecxt_estate;

	/* Functions to call back when ExprContext is shut down or rescanned */
	ExprContext_CB *ecxt_callbacks;
} ExprContext;

nodeNestloop.h

#ifndef NODENESTLOOP_H
#define NODENESTLOOP_H

#include "nodes/execnodes.h"


// xht03
extern void *GetNestedBlock(NestedBlock *block, PlanState *outerPlan);


extern NestLoopState *ExecInitNestLoop(NestLoop *node, EState *estate, int eflags);
extern void ExecEndNestLoop(NestLoopState *node);
extern void ExecReScanNestLoop(NestLoopState *node);

#endif							/* NODENESTLOOP_H */

nodeNestloop.c

// xht03
void *GetNestedBlock(NestedBlock *block, PlanState *outerPlan)
{
    for(int i = 0; i < NESTED_BLOCK_SIZE; i++)
    {
        block->tuple[i] = ExecProcNode(outerPlan);
		block->isMatched[i] = false;
        if(TupIsNull(block->tuple[i]))
        {
            block->size = i;
            break;
        }
    }
}



static TupleTableSlot *
ExecNestLoop(PlanState *pstate)	
{
	NestLoopState *node = castNode(NestLoopState, pstate);
	NestLoop   *nl;
	PlanState  *innerPlan;
	PlanState  *outerPlan;
	TupleTableSlot *outerTupleSlot;
	TupleTableSlot *innerTupleSlot;
	ExprState  *joinqual;
	ExprState  *otherqual;
	ExprContext *econtext;
	ListCell   *lc;

	// xht03
	NestedBlock block;


	CHECK_FOR_INTERRUPTS();	


	/*
	 * get information from the node
	 */
	ENL1_printf("getting info from node");

	nl = (NestLoop *) node->js.ps.plan;
	joinqual = node->js.joinqual;
	otherqual = node->js.ps.qual;
	outerPlan = outerPlanState(node);
	innerPlan = innerPlanState(node);
	econtext = node->js.ps.ps_ExprContext;

	/*
	 * Reset per-tuple memory context to free any expression evaluation
	 * storage allocated in the previous tuple cycle.
	 */
	ResetExprContext(econtext);

	/*
	 * Ok, everything is setup for the join so now loop until we return a
	 * qualifying join tuple.
	 */
	ENL1_printf("entering main loop");

	for (;;)
	{
		/*
		 * If we don't have an outer tuple, get the next one and reset the
		 * inner scan.
		 */
		if (node->nl_NeedNewOuter)
		{
			ENL1_printf("getting new outer tuple");
			// xht03
			GetNestedBlock(&block, outerPlan);


			/*
		 	 * if there are no more outer tuples, then the join is complete..
			 */
			// xht03
			if (block.size == 0)
			{
				ENL1_printf("no outer tuple, ending join");
				return NULL;
			}

			ENL1_printf("saving new outer tuple information");
			//econtext->ecxt_outertuple = outerTupleSlot;	// 记录外表所在的tuple，更新上下文
			
			// xht03
			outerTupleSlot = block.tuple[block.size - 1];
			econtext->ecxt_outertuple = outerTupleSlot;
			econtext->ecxt_block = &block;
			node->nl_NeedNewOuter = false;
			node->nl_MatchedOuter = false;
            
			/*
			 * fetch the values of any outer Vars that must be passed to the
			 * inner scan, and store them in the appropriate PARAM_EXEC slots.
			 */
            // 将外表的一些参数传递给内表
			foreach(lc, nl->nestParams)
			{
				NestLoopParam *nlp = (NestLoopParam *) lfirst(lc);
				int			paramno = nlp->paramno;
				ParamExecData *prm;
				prm = &(econtext->ecxt_param_exec_vals[paramno]);
				/* Param value should be an OUTER_VAR var */
				Assert(IsA(nlp->paramval, Var));
				Assert(nlp->paramval->varno == OUTER_VAR);
				Assert(nlp->paramval->varattno > 0);
				prm->value = slot_getattr(outerTupleSlot,
										  nlp->paramval->varattno,
										  &(prm->isnull));
				/* Flag parameter value as changed */
				innerPlan->chgParam = bms_add_member(innerPlan->chgParam,
													 paramno);
			}
			/*
			 * now rescan the inner plan
			 */
            // 重新确定内表从哪里开始扫描
			ENL1_printf("rescanning inner plan");
			ExecReScan(innerPlan);
		}


		/*
		 * we have an outerTuple, try to get the next inner tuple.
		 */
		ENL1_printf("getting new inner tuple");
		innerTupleSlot = ExecProcNode(innerPlan);		// 获取新的内部节点（tuple）
		econtext->ecxt_innertuple = innerTupleSlot;		// 记录内表所在的tuple，更新上下文

		if (TupIsNull(innerTupleSlot))
		{
			ENL1_printf("no inner tuple, need new outer tuple");
			node->nl_NeedNewOuter = true;
			/*
			* xht03:
			* If not all the tuples in the block have been matched,
			* or the join type is left-join or anti-join
			*/
			if (!node->nl_MatchedOuter &&
				(node->js.jointype == JOIN_LEFT ||
				 node->js.jointype == JOIN_ANTI))
			{
				/*
				 * We are doing an outer join and there were no join matches
				 * for this outer tuple.  Generate a fake join tuple with
				 * nulls for the inner tuple, and return it if it passes the
				 * non-join quals.
				 */

				for(int i = 0; i < block.size; i++){
					econtext->ecxt_outertuple = block.tuple[i];
					if(!block.isMatched[i]){
                        // set to a empty tuple
						econtext->ecxt_innertuple = node->nl_NullInnerTupleSlot;
                        
                        // 检查非连接条件（non-join quals）的其他条件是否满足
						ENL1_printf("testing qualification for outer-join tuple");
						if (otherqual == NULL || ExecQual(otherqual, econtext))
						{
							/*
						 	* qualification was satisfied so we project and return
						 	* the slot containing the result tuple using
						 	* ExecProject().
						 	*/
							ENL1_printf("qualification succeeded, projecting tuple");
							return ExecProject(node->js.ps.ps_ProjInfo);
						}
						else
							InstrCountFiltered2(node, 1);
					}
				}
			}
			/*
			 * Otherwise just return to top of loop for a new outer tuple.
			 */
			continue;
		}
		/*
		 * at this point we have a new pair of inner and outer tuples so we
		 * test the inner and outer tuples to see if they satisfy the node's
		 * qualification.
		 *
		 * Only the joinquals determine MatchedOuter status, but all quals
		 * must pass to actually return the tuple.
		 */
		ENL1_printf("testing qualification");
		
		// 测试是否满足 join 条件
		// xht03
		int matched_num = 0;

		for(int i = 0; i < block.size; i++){
			econtext->ecxt_outertuple = block.tuple[i];
			if (ExecQual(joinqual, econtext))
			{
				//xht03
				if(!block.isMatched[i]){
					matched_num++;
					block.isMatched[i] = true;
				}

				/*
				* If we only need to join to the first matching inner tuple, then
				* consider returning this one, but after that continue with next
				* outer tuple.
				*/
			
				if (otherqual == NULL || ExecQual(otherqual, econtext))
				{
					/*
					* qualification was satisfied so we project and return the
					* slot containing the result tuple using ExecProject().
					*/
					ENL1_printf("qualification succeeded, projecting tuple");
					return ExecProject(node->js.ps.ps_ProjInfo);
				}
				else
					InstrCountFiltered2(node, 1);
			}
			else
				InstrCountFiltered1(node, 1);
		}
		/*
		* xht03:
		* If all tuples in the block have been matched
		*/
		if(matched_num == block.size)
		{
			node->nl_MatchedOuter = true;
			/* In an antijoin, we never return a matched tuple */
			if (node->js.jointype == JOIN_ANTI || node->js.single_match)
			{
				node->nl_NeedNewOuter = true;
			}	
		}

			
		/*
		 * Tuple fails qual, so free per-tuple memory and try again.
		 */
		ResetExprContext(econtext);
		ENL1_printf("qualification failed, looping");
	}
}

源代码测试

重新编译并安装 postgresql 后，设置 NESTED_BLOCK_SIZE = 1、2、8、64、128、1024，运行下面语句两次，取第二次运行时间。

在数据库命令行中输入：\timing，打开测时间功能。

1	SELECT count(*) FROM restaurantaddress ra, restaurantphone rp WHERE ra.name = rp.name;

NESTED_BLOCK_SIZE	时间1/ms	时间2/ms	时间3/ms
1	2.710	2.559	2.565
2	2.579	2.753	2.636
8	2.337	2.483	2.716
64	2.277	2.441	1.896
128	2.657	2.543	2.464
1024	2.871	2.743	2.778

这样的实验结果是符合预期的：

NESTED_BLOCK_SIZE 太小时，与 Simple Nested-Loop 相近；当 NESTED_BLOCK_SIZE 太大时，就相当于内外表互换后的 Simple Nested-Loop。所以当 NESTED_BLOCK_SIZE 取一个大小适中的值时，Block Nested-Loop 才能性能最大化。