Ceres(二) 自动求导

头文件只有一个"ceres/ceres.h"

调试几个经典例子的过程

1
2
3
4
5
6
7
#include "ceres/ceres.h"

using ceres::AutoDiffCostFunction;
using ceres::CostFunction;
using ceres::Problem;
using ceres::Solver;
using ceres::Solve;


1
2
3
4
5
6
7
8
9
10
11
struct  Functor
{
template <typename T> bool operator()(const T* const x, T* residual) const
{
residual[0] = 10.0 - x[0];
residual[1] = 3.0 - x[1];
// 直接让第2维度为0,优化后的x[1]还是初值2
// residual[1] = T(0.0);
return true;
}
};

这里的变量为二维的,目标函数为10.0-x3.0-x

如果这里不写10.0而是10,就会报错。 Ceres中没有int,只接受double

对于residual[1] = T(0.0);的情况,如果不加T,编译就会报错 error: no match for ‘operator=’ (operand types are ‘ceres::Jet’ and ‘double’) . Jet类型是ceres内置类型,我们要把double转成Jet类型,也就是T(0.0)

1
2
3
4
5
6
7
8
struct  TestLocalParam
{
template<typename T> bool operator()(const T* x, const T* delta, T* x_plus_delta) const{
x_plus_delta[0] = x[0] + 0.4 * delta[0];
x_plus_delta[1] = x[1] + 0.4 * delta[1];
return true;
}
};

CostFunctoroperator就是计算残差,不带平方项,平方是ceres自动添加,相当于最小二乘的函数 f

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
// double x = 0.5;
double x[] = {0.5, 2.0};

Problem problem;

problem.AddParameterBlock(x, 2,
(new ceres::AutoDiffLocalParameterization
// 真正的维度是2, 参与优化的维度是1
<TestLocalParam, 2, 1> ())
);
// AddParameterBlock 修改之后,无论下面的AutoDiffCostFunction参数是多少,只优化第1维度


// 残差的维度,参数的维度
// 如果此时是 2,1. 报错
CostFunction* cost_func = new AutoDiffCostFunction<Functor, 2, 2>(new Functor);
// 参数:CostFunction,可选的LossFunction,将CostFunction连接到一组参数块
problem.AddResidualBlock(cost_func, nullptr, x);

创建最小二乘问题,可以使用Problem::AddResidualBlock()Problem::AddParameterBlock()。 可以使用Problem::AddParameterBlock()显式添加参数块,这将导致额外的正确性检查; 然而,如果参数块不存在,Problem::AddResidualBlock()会隐式添加参数块。

AddParameterBlock()显式地向Problem添加了一个参数块。它还允许用户将LocalParameterization对象与参数块关联起来。

AddResidualBlock默认会先调用AddParameterBlock,一般不用后者。

损失函数LossFunction,给异常值做限定,如果是nullptr,该项的代价就是残差的平方范数

代价函数携带关于它所期望的参数块大小的信息。函数检查这些参数是否与parameter_blocks中列出的参数块的大小匹配。如果检测到不匹配,程序将中止。

Solver

1
2
3
4
5
6
7
8
9
10
11
12
13
14
  Solver::Options  option;
option.minimizer_progress_to_stdout = true;
option.linear_solver_type = ceres::SPARSE_NORMAL_CHOLESKY;

Solver::Summary summary;
Solve(option, &problem, &summary);

std::cout << summary.BriefReport() << std::endl<<std::endl;
// std::cout << summary.FullReport() << std::endl<<std::endl;

// std::cout << "initial x: " << initial_x <<" x: "<< x << std::endl;
std::cout <<"x[0]: "<< x[0] <<" x[1]: "<< x[1]<< std::endl;
return 0;
}

优先选用自动微分算法,某些情况可能需要用到解析微分算法,尽量避免数值微分算法。

曲线拟合

使用十四讲第六章的曲线拟合的例子,已知 N 个数据 ,它们符合曲线 ,通过曲线拟合求 a, b, c

跟上面的例子不同,这次目标函数为 残差维度为3(未知a,b,c)

CMake配置如下

1
2
3
4
5
6
7
8
find_package(OpenCV REQUIRED)
INCLUDE_DIRECTORIES(${OpenCV_DIRS})

target_link_libraries(node_2
${catkin_LIBRARIES}
${CERES_LIBRARIES}
${OpenCV_LIBS}
)

先生成带有高斯噪声的数据

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
#include <opencv2/opencv.hpp>

// 真实值
double a =1.0, b=2.0, c=1.0;
// abc的初值
double abc[3] = {0,0,0};

// 生成带有噪声的模拟数据
vector<double> x_data, y_data;
int N = 240; // 数据数量
double w_sigma = 1.0; // 高斯标准差
cv::RNG rng;
for(unsigned int i =0; i<N; i++)
{
double x = i/100.0;
x_data.push_back(x);
y_data.push_back(exp(a*x*x + b*x + c) + rng.gaussian(w_sigma) );
}

高斯噪声的产生: cv::RNG rng; rng.gaussian(w_sigma)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
struct Functor
{
Functor(double x, double y):
m_x(x), m_y(y) { }
#if 0
// 开始我是这么写的,没弄清残差的维度
template<typename T> bool operator()( const T* const a,
const T* const b,
const T* const c,
T* residual) const
{
residual[0] = m_y - exp(a[0] * m_x * m_x + b[0] * m_x + c[0]);
return true;
}
#endif

// 因为要传入数据(x,y), 所以要定义两个成员变量
template<typename T> bool operator()( const T* const abc, T* residual) const
{
residual[0] = m_y - exp(abc[0] * m_x * m_x + abc[1] * m_x + abc[2]);
return true;
}
const double m_x, m_y;
};

因为自变量x是一维的,而不是向量,所以残差只计算一维 residual[0]

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
Problem problem;
for(unsigned int i=0; i<N; i++)
{
// 残差维度, 所估计变量的维度
CostFunction* cost_func = new AutoDiffCostFunction<Functor,1,3>(new Functor(x_data[i], y_data[i]) );
problem.AddResidualBlock(cost_func, nullptr, abc);
}
Solver::Options option;
option.minimizer_progress_to_stdout = true;
option.linear_solver_type = ceres::SPARSE_NORMAL_CHOLESKY;

Solver::Summary summary;
Solve(option, &problem, &summary);

cout << summary.BriefReport() << endl <<endl;
cout << "a: "<< abc[0] <<" b: "<< abc[1]<< " c: "<< abc[2] <<endl;

参考:
Curve Fitting(曲线拟合)
Problem类
资料


Hector

hector相当于cartographer的前端,没有闭环检测,利用高斯牛顿方法来解决scan-matching的问题。它要求激光雷达的扫描频率比较高,建图时机器人速度需要很慢,但在机器人快速转弯时容易发生错误,原因在于优化算法容易陷入局部最小值。使用了多分辨率地图,也可以没有里程计。

论文的新颖点在于scanMatch使用高斯牛顿法以及多分辨率地图,后来被cartographer借鉴。

对向量求导.png

缺点

  1. hector的计算太慢了,只进行扫描匹配的操作就要花费大概0.1秒钟,这就导致即使雷达频率再高,hector也处理不过来
  2. hector的地图不能自动更改大小,地图的大小在初始化之后始终是固定的
  3. 适用于非光滑线性逼近的地图梯度∇M(Si(ε)),这意味着不能保证局部二次收敛到最小,但该算法在实践中精度较高。

参考:
李想对hector的分析
论文阅读:hector_slam


cartographer_grpc_server的使用和配置

使用gRPC与云端结合:cartographer使用Protobuf数据传输协议,所以可以使用gRPC(也是谷歌的),可以通过云端运行cartographer算法,最典型的例子就是多机器人在一张已知地图上进行导航,在一个远程的强大centralized localization server运行 multi-trajectories Cartographer

方案:一台机器运行local optimization,另一台运行 global optimization

Cloud-based mapping ,using a gRPC streaming connection between local and global SLAM to propagate sensor data and local SLAM updates, which does not support ACKs and resending, hence this is not going to work over spotty WiFi.

先安装protobufceres

安装 grpc

一开始下载grpc后进行编译,结果报错:

1
2
3
4
CMake Warning at cmake/abseil-cpp.cmake:30 (message):
gRPC_ABSL_PROVIDER is "module" but ABSL_ROOT_DIR is wrong
Call Stack (most recent call first):
CMakeLists.txt:188 (include)

这说明第三方库没准备好,最好按如下步骤:

如果只想下载指定版本的,如以版本1.27.3为例,可改成如下语句:

1
git clone -b v1.27.0 https://github.com/grpc/grpc.git

上列操作成功完成后,gRPC 源码的第三方依赖目录 third_party 实际是空的,需通过下列步骤拉取依赖的第三方。切换到 grpc 目录,下载 grpc 第三方依赖到本地:
1
git submodule update --init

grpc依赖的第三方库有点多,如果不借助git submodule,手工一个个下载不太容易。其中的bloaty库很大。

注意,gRPC 要求 CMake 3.5.1 或以上版本的CMake,否则会报错 CMake 3.5.1 or higher is required

编译:

1
2
3
4
5
6
7
8
9
10
11
cmake -DgRPC_SSL_PROVIDER=package ..
make -j8
sudo make install

cmake -DgRPC_INSTALL=ON -DgRPC_ZLIB_PROVIDER=package -DgRPC_CARES_PROVIDER=package -DgRPC_PROTOBUF_PROVIDER=package -DgRPC_SSL_PROVIDER=package -DgRPC_ABSL_PROVIDER=package ..
make -j8
sudo make install

cmake -DgRPC_INSTALL=ON -DBUILD_SHARED_LIBS=ON -DgRPC_ZLIB_PROVIDER=package -DgRPC_CARES_PROVIDER=package -DgRPC_PROTOBUF_PROVIDER=package -DgRPC_SSL_PROVIDER=package -DgRPC_ABSL_PROVIDER=package ..
make -j8
sudo make install

CMake配置

在包含option(BUILD_GRPC "build Cartographer gRPC support" false)的3个CMakeLists修改为true。 最好再添加 set(BUILD_GRPC true),以及C++14的支持 set(CMAKE_CXX_STANDARD 14)

lua配置

backpack_2d_server.lua如下:

1
2
3
4
include "map_builder_server.lua"

MAP_BUILDER_SERVER.map_builder.use_trajectory_builder_2d = true
return MAP_BUILDER_SERVER

map_builder_server.lua如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
include "map_builder.lua"

MAP_BUILDER_SERVER = {
map_builder = MAP_BUILDER,
num_event_threads = 4,
num_grpc_threads = 4,
--- 我的修改
server_address = "192.168.0.105:50051",
uplink_server_address = "",
upload_batch_size = 100,
enable_ssl_encryption = false,
enable_google_auth = false,
}

运行数据集

1
roslaunch cartographer_ros grpc_demo_backpack_2d.launch bag_filename:=/home/user/cartographer/dataset/cartographer_paper_deutsches_museum.bag

这是在一台机上运行客户端和服务端

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
<launch>
<param name="/use_sim_time" value="true" />
<param name="robot_description"
textfile="$(find cartographer_ros)/urdf/backpack_2d.urdf" />

<node name="robot_state_publisher" pkg="robot_state_publisher"
type="robot_state_publisher" />
<!-- 服务端 -->
<node name="cartographer_grpc_server" pkg="cartographer_ros"
type="cartographer_grpc_server.sh" args="
-configuration_directory $(find cartographer_ros)/configuration_files
-configuration_basename backpack_2d_server.lua">
</node>
<!-- 客户端 -->
<node name="cartographer_grpc_node" pkg="cartographer_ros"
type="cartographer_grpc_node" args="
-client_id CLIENT_ID
-configuration_directory $(find cartographer_ros)/configuration_files
-configuration_basename backpack_2d.lua"
output="screen">
<remap from="echoes" to="horizontal_laser_2d" />
</node>

<node name="playbag" pkg="rosbag" type="play"
args="--clock $(arg bag_filename)" />
<node name="rviz" pkg="rviz" type="rviz" required="true"
args="-d $(find cartographer_ros)/configuration_files/demo_2d.rviz" />
</launch>

就是CLIENT_ID,无需设置为数字。

真实运行

  • 把客户端上的ROS_MASTER_URI设置为服务端的IP
  • rviz放在服务端。
  • 服务端的终端刷新日志,客户端不刷新
  • 网络不能差,否则服务端刷新慢。
  • 连接成功后,关服务端,客户端也会终止。连接成功后,关客户端,服务端停止刷新;再启动客户端,服务端继续更新。

客户端

1
2
3
4
5
6
7
8
9
10
11
12
<launch>
<node name="cartographer_grpc_node" pkg="cartographer_ros"
type="cartographer_grpc_node" args="
-client_id CLIENT_ID
-server_address 192.168.1.213:50051
-configuration_directory $(find cartographer_ros)/configuration_files
-configuration_basename backpack_2d.lua"
output="screen">
<remap from="/scan" to="/sick_tim551_scan" />
<remap from="/odom" to="/odometry/filtered" />
</node>
</launch>

服务端

1
2
3
4
5
6
7
8
9
10
<launch>
<node name="cartographer_grpc_server" pkg="cartographer_ros"
type="cartographer_grpc_server.sh" args="
-configuration_directory $(find cartographer_ros)/configuration_files
-configuration_basename backpack_2d_server.lua">
</node>

<node name="rviz" pkg="rviz" type="rviz" required="true"
args="-d $(find cartographer_ros)/configuration_files/demo_2d.rviz" />
</launch>

纯定位

1
2
3
4
5
6
7
8
9
<node name="cartographer_grpc_node" pkg="cartographer_ros"
type="cartographer_grpc_node" args="
-client_id CLIENT_ID
-configuration_directory $(find cartographer_ros)/configuration_files
-configuration_basename backpack_2d_localization.lua
-load_state_filename $(arg load_state_filename)"
output="screen">
<remap from="echoes" to="horizontal_laser_2d" />
</node>

参考:
运行cartographer的gRPC demo
Ubuntu下安装c-ares库


如何打印当前使用的lua参数

这里就是使用cartographer_print_configuration,它输出了当前cartographer使用的所有lua参数,虽然不如rosparam强大,但也是一种方法,本质就是读取加载的lua文件。

代码比较简单,不用太过研究,直接写launch

1
2
3
4
5
6
7
8
9
<launch>
<node name="print_mit_config" pkg="cartographer_ros"
type="cartographer_print_configuration" args="
-configuration_directories $(find cartographer_ros)/configuration_files
-configuration_basename mit_stata_odom.lua
-subdictionary map_builder"
output="screen">
</node>
</launch>

运行后会报错,找不到cartographer_print_configuration,这是因为可执行文件在bin路径: /home/user/carto_ws/install_isolated/bin/cartographer_print_configuration,roslaunch找的是/home/user/carto_ws/install_isolated/lib/cartographer_ros路径下的文件,复制到这里就好了。

运行launch就会输出所有lua参数,自己设置的参数也会覆盖默认的。 另外还有个subdictionary参数,输出的是子参数,这里用map_builder。但是只能输出第一级的子参数,如果自参数太底层就会报错,比如submaps做子参数就会运行错误。


点云地图
  • 去除噪点:激光雷达扫的点云有很多噪点,需要用PCL移除噪点。有的是孤立点,有的是激光打到边缘产生的衍射现象,衍射产生的点也要移除。

  • 降采样:激光的数据量很大,需要用Voxel进行降采样

  • 移除行人: 动态障碍,没必要采用

  • 移除地面: 地面不是重要的参考,也要移除

  • 点云变换和增量式注册:

  • mesh化: 点云地图建好后,把点云用边连起来,形成三维模型,再用Matlab进行处理


cartographer_occupancy_grid_node获取地图

Rviz中的地图绘制使用了cairo库。注意gflag的参数include_frozen_submapsinclude_unfrozen_submaps不能都为false,否则会报错

cartographerrviz的绘制方式,就是`submap_node控制一个子图的绘制,每个子图有一个子图原点,这个是submap_query中的slice_pose。而每个子图之间是有微弱旋转的,而控制旋转的中心也是这个slice_pose。而每个子图之间的旋转和平移可以通过submap_list`的pose得到。

main函数值得看的就一句::cartographer_ros::Node node(FLAGS_resolution, FLAGS_publish_period_sec);,构造函数如下

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
Node::Node(const double resolution, const double publish_period_sec)
: resolution_(resolution),
// submap_query 服务的客户端
client_(node_handle_.serviceClient<::cartographer_ros_msgs::SubmapQuery>(
kSubmapQueryServiceName)),
// submap_list话题的订阅者,回调函数 HandleSubmapList
submap_list_subscriber_(node_handle_.subscribe(
kSubmapListTopic, kLatestOnlyPublisherQueueSize,
boost::function<void(
const cartographer_ros_msgs::SubmapList::ConstPtr&)>(
[this](const cartographer_ros_msgs::SubmapList::ConstPtr& msg) {
HandleSubmapList(msg);
}))),
// 发布话题 map
occupancy_grid_publisher_(
node_handle_.advertise<::nav_msgs::OccupancyGrid>(
kOccupancyGridTopic, kLatestOnlyPublisherQueueSize,
true /* latched */)),
// timer执行DrawAndPublish函数
occupancy_grid_publisher_timer_(
node_handle_.createWallTimer(::ros::WallDuration(publish_period_sec),
&Node::DrawAndPublish, this)) {}

Node::DrawAndPublish是画图和发布map,不是重点


HandleSubmapList函数大部分都很简单,终点是下面这句:

1
2
auto fetched_textures =
::cartographer_ros::FetchSubmapTextures(id, &client_);

内部其实就是submap_query的客户端发起请求,服务端的注册在Node的构造函数里,回调是Node::HandleSubmapQuery

代码调用顺序是: Node::HandleSubmapQuery —— MapBuilderBridge::HandleSubmapQuery —— MapBuilder::SubmapToProto —— PoseGraph2D::GetSubmapDataSubmap2D::ToResponseProto —— ProbabilityGrid::DrawToSubmapTexture

重点:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
// 这里比较简单,最终返回的是后端传来的 global坐标系下的位姿
PoseGraphInterface::SubmapData PoseGraph2D::GetSubmapDataUnderLock(
const SubmapId& submap_id) const
{
const auto it = data_.submap_data.find(submap_id);
if (it == data_.submap_data.end())
return {};
auto submap = it->data.submap;
if (data_.global_submap_poses_2d.Contains(submap_id) )
{
// We already have an optimized pose.
return {submap,
transform::Embed3D(
data_.global_submap_poses_2d.at(submap_id).global_pose) };
}
// We have to extrapolate.
return {submap, ComputeLocalToGlobalTransform(data_.global_submap_poses_2d,
submap_id.trajectory_id) *
submap->local_pose() };
}

void Submap2D::ToResponseProto(
const transform::Rigid3d&,
proto::SubmapQuery::Response* const response) const
{
if (!grid_) return;
// 这个 version 是子图插入了多少雷达数据,完成的子图是 2* num_range_data
response->set_submap_version( num_range_data() );
proto::SubmapQuery::Response::SubmapTexture* const texture =
response->add_textures();
grid()->DrawToSubmapTexture(texture, local_pose());
}

再看DrawToSubmapTexture

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
bool ProbabilityGrid::DrawToSubmapTexture(
proto::SubmapQuery::Response::SubmapTexture* const texture,
transform::Rigid3d local_pose) const
{
Eigen::Array2i offset;
CellLimits cell_limits;
ComputeCroppedLimits(&offset, &cell_limits);

std::string cells;
for (const Eigen::Array2i& xy_index : XYIndexRangeIterator(cell_limits))
{
if (!IsKnown(xy_index + offset)) {
cells.push_back(0 /* unknown log odds value */);
cells.push_back(0 /* alpha */);
continue;
}
// We would like to add 'delta' but this is not possible using a value and
// alpha. We use premultiplied alpha, so when 'delta' is positive we can
// add it by setting 'alpha' to zero. If it is negative, we set 'value' to
// zero, and use 'alpha' to subtract. This is only correct when the pixel
// is currently white, so walls will look too gray. This should be hard to
// detect visually for the user, though.

// delta范围[-127, 127]
const int delta =
128 - ProbabilityToLogOddsInteger(GetProbability(xy_index + offset));
const uint8 alpha = delta > 0 ? 0 : -delta;
const uint8 value = delta > 0 ? delta : 0;
// 存数据有2个值,栅格值value和alpha透明度
cells.push_back(value);
cells.push_back((value || alpha) ? alpha : 1);
}
// 保存地图栅格数据时进行压缩
common::FastGzipString(cells, texture->mutable_cells());
// 地图描述信息赋值,省略
......
*texture->mutable_slice_pose() = transform::ToProto(
local_pose.inverse() *
transform::Rigid3d::Translation(Eigen::Vector3d(max_x, max_y, 0.)) );

return true;
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
// 解压后的栅格数据
struct SubmapTexture {
struct Pixels {
std::vector<char> intensity; // 栅格值
std::vector<char> alpha; // 透明度
};
Pixels pixels;
int width;
int height;
double resolution;
::cartographer::transform::Rigid3d slice_pose;
};

// 压缩后的栅格数据
struct SubmapTextures {
int version;
std::vector<SubmapTexture> textures;
};

MIT和软件所的数据集

根据我的经验,先启动cartographer后,等待出现下面的日志,然后再播放bag。如果这个日志没出现,播放bag通常不会成功开始建图

1
2
[/cartographer_node   node.cc:886] Expected topic "scan" (trajectory 0) (resolved topic "/base_scan") but no publisher is currently active.
[/cartographer_node node.cc:895] Currently available topics are: /constraint_list,/submap_list,/scan_matched_points2,/rosout,/tf,/rosout_agg,/map,/trajectory_node_list,/landmark_poses_list,

MIT数据集老出问题,德意志博物馆数据集却从不出问题 最好数据集里不要有tf信息
博物馆的tf
MIT数据集的 tf

博物馆的地图大小 3066平方米

MIT的数据集运行

这个数据集使用雷达北洋UTM-30LX,也就是LOAM论文所用的雷达,有一个问题是它的机器人是会坐电梯在楼层之间跑的,因此需要选择一个只在一个平面上跑的数据集2011-03-28-08-38-59.bag

从TF变换树可以看到,这里的里程计信息是采用经过EKF滤波之后的里程计坐标系/robot_pose_ekf/odom_combined,而不是直接的base_odometry/odom第2个问题来了:cartographer需要的里程计消息是nav_msgs/Odometry,但是/robot_pose_ekf/odom_combined的类型是geometry_msgs/PoseWithCovarianceStamped,在播放rosbag时会报错,无法建图。于是想用原来的里程计/base_odometry/odom,结果发现缺少child_frame_id,一般为base_link。无法给bag消息加坐标系,只能写一个程序mit_to_odom做转换

我们要使用数据集的IMU和odom,否则建图不能进行,日志如下

1
2
3
4
[/cartographer_node] [local_trajectory_builder_2d.cc:136] time_first_point: 634369235390529797
[/cartographer_node] [local_trajectory_builder_2d.cc:137] GetLastPoseTime: 634369235390647384
[/cartographer_node] [local_trajectory_builder_2d.cc:138] Extrapolator is still initializing
[/cartographer_node] [pose_graph_2d.cc:148] Inserted submap (0, 0).

阻塞在
1
2
3
4
5
if (time_first_point < extrapolator_->GetLastPoseTime() )
{
LOG(INFO) << "Extrapolator is still initializing.";
return nullptr;
}

launch如下,不能加<param name="/use_sim_time" value="true" />,否则cartographer一直阻塞。

1
2
3
4
5
6
7
8
9
10
11
12
13
<launch>
<node name="cartographer_node" pkg="cartographer_ros"
type="cartographer_node" args="
-configuration_directory $(find cartographer_ros)/configuration_files
-configuration_basename mit_stata_odom.lua"
output="screen">
<remap from="scan" to="base_scan" />
<remap from="imu" to="torso_lift_imu/data" />
</node>

<node name="cartographer_occupancy_grid_node" pkg="cartographer_ros"
type="cartographer_occupancy_grid_node" args="-resolution 0.05" />
</launch>

TF_REPEATED_DATA 问题

1
2
Warning: TF_REPEATED_DATA ignoring data with redundant timestamp for frame odom at time 1301326803.532597 according to authority unknown_publisher
at line 278 in /tmp/binarydeb/ros-noetic-tf2-0.7.5/src/buffer_core.cpp

这个问题困扰了我好长时间,后来发现是数据集发布了坐标转换odom_combined —— base_footprint,而cartographer也容易发布这个变换,此时仍然可以建图,但这个报警无穷无尽,没法看其他日志了。

如此解决:首先在播放bag的终端,先运行rosparam set use_sim_time true,lua设置如下

1
2
3
4
5
6
7
8
9
map_frame = "map",
tracking_frame = "imu_link",
published_frame = "odom_combined",
odom_frame = "odom_combined",

provide_odom_frame = false,
use_odometry = true,

TRAJECTORY_BUILDER_2D.use_imu_data = true

但是Rviz终端也出现了报警,按说把Rviz里的Submaps的Tracking Frame改为lua 文件中的 tracking_frame 即可,但是我改了后无效,把base_footprintbase_link都试了一遍也没有解决

1
[ WARN] Could not compute submap fading: Could not find a connection between 'map' and 'base_link' because they are not part of the same tree.Tf has two or more unconnected trees

这样导致无法获得`base_link`在`map`中的坐标

mapbase_link的关系问题

数据集中已经包含了odom_combinedbase_link之间的tf关系,运行carto之后会获得odom_combinedmap之间的关系,但是无法获得mapbase_link之间的tf,运行rosrun tf tf_echo map base_link报警如下

1
Exception thrown:Lookup would require extrapolation 367325588.004024029s into the past.  Requested time 1301326755.953867912 but the earliest data is at time 1668652343.957891941, when looking up transform from frame [base_link] to frame [map]

1668652343是当前时间的,1301326755是数据集中的时间。 目前未解决

软件所的数据集运行


不需要在任何地方设置use_sim_time,如果出现时间戳问题,关掉一切节点,重新开始roscore。launch里不能加<param name="/use_sim_time" value="true" />,否则cartographer一直阻塞。

问题:

  1. scan话题里的坐标系是laser,这是驱动程序决定的。但是tf里的坐标系变成了laser_mount_link,到了tf_bridge_.LookupToTracking那里卡住了,即使手动改了源码也不行。

  2. 坐标系名称带了/,这个其实在cartographer里会被解决,但是习惯上不要加上。

  3. 也无法获得mapbase_link的转换。

  4. tfecho map odom会发现位姿跳变极大, 还没搞清楚, 不是是否和时间戳有关

参考:
TF_OLD_DATA 问题
TF_OLD_DATA的另一种原因


move_base的服务端和客户端

find_package中添加actionlibactionlib_msgs

1
2
3
4
5
add_action_files(
DIRECTORY action
FILES
DoDishes.action
)

package.xml添加actionlib的编译和执行依赖,如下:

1
2
3
4
<build_depend>actionlib</build_depend>
<build_depend>actionlib_msgs</build_depend>
<run_depend>actionlib</run_depend>
<run_depend>actionlib_msgs</run_depend>

服务端

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
// action定义在 scheduler/action/Scheduler.action
typedef actionlib::SimpleActionServer<scheduler::SchedulerAction> Server;
actionlib::SimpleActionServer<scheduler::SchedulerAction> as_;

// 注册抢占回调函数
as_.registerPreemptCallback(boost::bind(&Explore::preemptCB, this))
as_.registerGoalCallback(boost::bind(&Explore::goalCB, this));


void Explore::preemptCB()
{
ROS_INFO("%s: Preempted", action_name_.c_str());
explore_ = false;
explore_state_ = STOP;
ac_.cancelAllGoals();
if(as_.isActive())
as_.setPreempted();
}

void Explore::goalCB()
{
explore_state_ = FULL_ROTATE;
as_.acceptNewGoal();
}

SimpleActionServer有构造函数可以注册ExecuteCallback回调函数,但没有其他两个回调,原型:
SimpleActionServer (ros::NodeHandle n, std::string name, ExecuteCallback execute_callback, bool auto_start)

move_base的构造函数中是这样使用的:
as_ = new MoveBaseActionServer(ros::NodeHandle(), "move_base", boost::bind(&MoveBase::executeCb, this, _1), false);

服务端as_最后需要start()来启动,再加ros::spin()阻塞。 MoveBase构造函数的最后只有前者,ros::spin()其实是在move_base_node.cpp

客户端

这里只看带回调函数的函数原型:

1
2
3
4
5
void actionlib::SimpleActionClient< ActionSpec >::sendGoal(const Goal& goal,
SimpleDoneCallback done_cb = SimpleDoneCallback(),
SimpleActiveCallback active_cb = SimpleActiveCallback(),
SimpleFeedbackCallback feedback_cb = SimpleFeedbackCallback()
)

ActionServer发送goal, 并注册回调函数。 如果调用本函数时已经有个goal处于active,状态机就会忽略goal并且开始向新的goal导航,不用发取消的请求。

  • done_cb: Callback that gets called on transitions to Done,用于监听服务器任务执行完后的相应消息以及客户端的相应处理
  • active_cb: Callback that gets called on transitions to Active,服务器任务被激活时的消息提示以及客户端的相应处理
  • feedback_cb: Callback that gets called whenever feedback for this goal is received,接收服务器的反馈消息以及客户端的相应处理
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    typedef actionlib::SimpleActionClient<move_base_msgs::MoveBaseAction> MoveBaseClient;
    MoveBaseClient ac_;

    // 示例
    ac_.sendGoal(goal,
    boost::bind(&DoDishesActionClient::DoneCb, this, _1, _2),
    boost::bind(&DoDishesActionClient::ActiveCb, this),
    boost::bind(&DoDishesActionClient::FeedbackCb, this, _1));


    void DoneCb(const actionlib::SimpleClientGoalState& state,
    const first_actionlib_sample::DoDishesResultConstPtr& result)
    {
    ROS_INFO("Finished in state [%s]", state.toString().c_str());
    ROS_INFO("Toal dish cleaned: %i", result->toal_dishes_cleaned);
    ros::shutdown();
    }

    // 当目标激活的时候,会调用一次
    void ActiveCb()
    {
    ROS_INFO("Goal just went active");
    }

    // 接收服务器的反馈信息
    void FeedbackCb(const first_actionlib_sample::DoDishesFeedbackConstPtr& feedback) {
    ROS_INFO("Got Feedback Complete Rate: %f", feedback->percent_complete);
    }

Goal的状态

GoalStatus

1
2
3
4
5
6
7
8
9
10
PENDING = 0
ACTIVE = 1
PREEMPTED = 2
SUCCEEDED = 3
ABORTED = 4
REJECTED = 5
PREEMPTING = 6
RECALLING = 7
RECALLED = 8
LOST = 9

1
2
3
4
5
6
7
cout<< ac_.getState().toString() <<endl;

// 获取goal的状态信息
// PENDING, ACTIVE, RECALLED, REJECTED, PREEMPTED, ABORTED, SUCCEEDED, LOST(isn't tracking a goal)
SimpleClientGoalState getState () const
// 输出上面的几个状态,大写字母,对调试很有用
std::string toString() const

如果当前已经在某个状态,再次设置为这个状态就会报警:比如ABORTED状态 To transition to an aborted state, the goal must be in a preempting or active state, it is currently in state: 4

参考:ROS Action动作通讯编程


连续发导航目标时的导航中断

车收到导航命令没有动,查日志如下
move_base接收的目标四元数invalid.png

在一个客户端程序里,在sendGoal之前调用了两次cancelAllGoals,结果得到下面的日志,某次goal的消息全是0

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
[ INFO] [1648276976.479444090]: move_base client send goal: type: 1
target_pose:
header:
seq: 11
stamp: 1648276976.479361050
frame_id: map
pose:
position:
x: -3.9901
y: 0.427604
z: 0
orientation:
x: 0
y: 0
z: 0
w: 1

[ INFO] [1648276976.492017723]: move base task just went active
[ WARN] [1648276979.458537608]: 32 receive cmd: position:
x: -4.01633
y: 0.440816
z: 0
orientation:
x: 0
y: 0
z: 0
w: 1

[ INFO] [1648276979.494004129]: actionlib result state: PREEMPTED
[ INFO] [1648276979.494157917]: actionlib result: result: 0
result_pose:
header:
seq: 0
stamp: 0.000000000
frame_id:
pose:
position:
x: 0
y: 0
z: 0
orientation:
x: 0
y: 0
z: 0
w: 0

[ INFO] [1648276979.494228945]: actionDoneCb state PREEMPTED
[ INFO] [1648276979.494308493]: move_base client send goal: type: 1
target_pose:
header:
seq: 12
stamp: 1648276979.494253319
frame_id: map
pose:
position:
x: 0
y: 0
z: 0
orientation:
x: 0
y: 0
z: 0
w: 0

[ERROR] [1648276979.504876032]: BUG: Got a transition to CommState [PENDING] when our in SimpleGoalState [DONE]
[ERROR] [1648276979.504963821]: BUG: Got a transition to CommState [RECALLING] when our in SimpleGoalState [DONE]
[ERROR] [1648276979.514087412]: BUG: Got a transition to CommState [PREEMPTING] when in SimpleGoalState [DONE]
[ERROR] [1648276979.515084871]: BUG: Got a second transition to DONE

结果导致move_base漏接了一个目标,最后取消两个cancelAllGoals就好了

  • cancelAllGoals()[inline]: Cancel all goals currently running on the action server. This preempts all goals running on the action server at the point that this message is serviced by the ActionServer.

  • cancelGoal()[inline]: Cancel the goal that we are currently pursuing

cancellAllGoals()cancellGoal()经常放在SIGINT信号的回调函数里,用于停止所有actions

参考: elektron_ballcollector


智能指针(三) weak_ptr

weak_ptr本身也是一个模板类,但是不能直接用它来定义一个智能指针的对象,只能配合shared_ptr来使用,可以将shared_ptr的对象赋值给weak_ptr,并且这样并不会改变引用计数的值。查看weak_ptr的代码时发现,它主要有lockswapresetexpiredoperator=use_count几个函数,与shared_ptr相比多了lockexpired函数,但是却少了get函数,甚至连operator*operator->都没有

weak_ptr解决循环引用

weak_ptr必须跟shared_ptr配合使用, 它用于解决shared_ptr的死锁问题,如果两个shared_ptr一直互相引用,那么它们的引用计数永远不是0,资源永远不释放,这样实际造成了内存泄露。weak_ptr并不拥有其指向的对象,让weak_ptr指向shared_ptr所指向对象,对象的引用计数并不会增加

循环引用的情况:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
class Son;
class Father
{
public:
~Father() {cout<<" father 析构"<<endl; }
shared_ptr<Son> son_;
};
class Son {
public:
~Son() {cout<<" son 析构"<<endl; }
shared_ptr<Father> father_;
};

auto father = make_shared<Father>();
auto son = make_shared<Son>();
father->son_ = son;
son->father_ = father;

结果一个析构函数也没运行,说明对象资源没有释放。使用weak_ptr解决很简单,让Son的成员变量father改为weak_ptr类型,运行后发现两个析构都有了。

main函数退出前,Son对象的引用计数是2,而Father的引用计数是1。
son指针销毁,Son对象的引用计数变成1。
father指针销毁,Father对象的引用计数变成0,导致Father对象析构,Father对象的析构会导致它包含的son_指针被销毁,这时Son对象的引用计数变成0,所以Son对象也会被析构。

参考:shared_ptr循环引用的例子及解决方法示例