N90-29022
Vision for Robot Acting on Moving Bodies (RAMBO):
Interaction with Tumbling Objects

Larry S. Davis, Daniel DeMenthon, Thor Bestul, Madhu Siddalingaiah,
David Harwood, H. V. Srinivasan, Sotirios Ziavras

Computer Vision Laboratory, Center for Automation Research
University of Maryland, College Park, MD 20742
Abstract

We are developing a robot with the capability of interacting with a nearby tumbling object. The robot is equipped with a camera, and, given a sequence of tasks, can perform these tasks on the moving object. RAMBO is given a complete geometric model of the object. A low level vision module extracts and groups characteristic features in images of the object. The positions of the object in a sequence of images are determined by matching image features to model features, and a motion estimate of the object is computed. This estimate is used to plan the actions of the robot.

More specifically, low level vision uses parallel algorithms for image enhancement by nearest neighbor filtering, edge detection by local gradient operators, and corner extraction by "sector filtering". The object pose estimation is a Hough transform method obtained by matching triples of image features (corners) to triples of model features and accumulating position hypotheses of model features. To maximize speed, the perspective projection is decomposed into a product of rotations and a scaled orthographic projection; this allows us to make extensive use of 2D lookup tables at each stage of the decomposition. The position hypotheses for the pairs of model feature triples and image feature triples are calculated in parallel. Trajectory planning uses heuristic and dynamic programming techniques, and trajectories between initial and goal trajectories are created by symmetric interpolations. All the parallel algorithms run on a Connection Machine CM-2 with 16K processors.

1 Introduction

The problem of robotic interaction with moving bodies has received considerable attention in recent years [1-11], but research activity has so far mostly concentrated on navigation in static environments [12-16]. The development of strategies general enough to allow a robot to estimate the motion of a moving element, plan a sequence of actions, and perform these actions on the element has seen very little activity.

One of the natural domains of application for this type of research is robotics in space. During the assembling of a large complex structure in space, a structural element might break loose; in the absence of gravity, the lost element would drift away, translating and rotating (tumbling), and a robot, such as the arm of an autonomous vehicle, would have to intercept it before it is out of reach. A human operator attempting to plan and control these actions using only his visual and mental capacities might have very little chance of success, since human capacities may not be sufficient to react to the relative motion in the short time available. However, the operator could be helped by a teleoperated robot equipped with a video camera and the proper gripping equipment. While in teleoperating mode, the operator could switch to a mode in which the robot is on its own for recovering the element. In case of such an emergency, the robot would already have retrieved from its database all the relevant information describing the geometry of the structural element being assembled; analyzing a sequence of images, it could keep track of the trajectory of the element along with the various goal points which could be reached for a proper grip, and could bring its gripper to the right spot at the right time without human intervention. Extrapolating the trajectories of these goal points, the robot could plan its own motion so as to reach and follow the trajectory of a goal point of the element. Along this goal trajectory the moving element appears stationary to the gripper, so that the gripping action can be accomplished as if in a static environment.

We have set up an experimental facility which has the necessary components for testing such vision-based control algorithms for intercepting moving objects. These algorithms could later be incorporated in the navigation system of a vehicle able to intercept other vehicles, or in a robotic system able to recover an object tumbling freely in space. This facility is described in the following section.
2 Experimental Set-up

A large American Cimflex robot translates and rotates an object (called the target in this paper) in location and orientation, subject to overall time constraints (Figure 1, top left). A smaller robot arm (Mitsubishi RM-501), RAMBO, is equipped with a CCD camera with focusing optics and a laser pointer (Figure 1, top right). Images from the camera are digitized and sent to the Connection Machine for processing.

Several light-sensitive diodes are mounted on the surface of the target. RAMBO's goal is to hit a sequence of these diodes on the moving target with its laser beam for given durations; electronics inside the object signal success by turning on an indicator light. Simultaneously, we are developing a full computer simulation of this set-up, in which the camera inputs are replaced by synthetic images.

3 Summary of Operations

The vision-based control loop for RAMBO, from data collection to robot motion control, is shown in Figure 1. We briefly describe the functions of the different modules of this system and refer to the sections of this paper which give more details.
1. The digitizer digitizes the output of the video camera mounted on the robot arm, grabbing new video frames when new visual information is needed. A database contains a list of positions of feature points on the target, in the local coordinate system of the target.

2. A low-level vision module extracts locations of feature points from the digitized image (Section 4).

3. An intermediate-level vision module finds the location/orientation of the target in the camera coordinate system (Section 5).

4. Since the past camera trajectory is known, the position of the camera in the robot base coordinate system when the frame was grabbed is known, and the location/orientation of the target can be transformed from the camera coordinate system to the robot base coordinate system (Section 6).

5. This most recent target pose at a specific time is added to the list of target poses at previous times. In the Target Motion Predictor, a target trajectory in location/orientation space is fitted to these past target poses and extrapolated to the future to form a predicted target trajectory (Section 6).

6. From the predicted target trajectory, the predicted trajectories of the goal points are computed (a goal point is a location/orientation, in the frame of the robot base, which one of the joints of the robot must reach to perform one of the subtasks). The Motion Planner finds an optimal order for the subtasks, and the Robot Motion Planner calculates the trajectories the robot has to follow for following the target and reaching the goal points (Sections 7, 8, 9).
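The module chain of this control loop can be summarized as one function executed per frame. The sketch below is only a skeleton under stated assumptions: every helper is a hypothetical stand-in (a pass-through, a dummy pose, a linear fit) for the real modules of Sections 4-6, not the system's actual code.

```python
def extract_feature_points(frame):
    # stand-in for the low-level vision module (Section 4)
    return frame["corners"]

def estimate_pose(corners):
    # stand-in for pose estimation (Section 5): a fixed dummy offset
    return (0.0, 0.0, 1.0)

def to_base_frame(pose_cam, camera_xyz):
    # step 4: transform the target pose from camera to robot base frame
    return tuple(c + p for c, p in zip(camera_xyz, pose_cam))

def fit_and_extrapolate(history, t_future):
    # stand-in for the Target Motion Predictor (Section 6): linear fit
    if len(history) < 2:
        return history[-1][1]
    (t0, p0), (t1, p1) = history[0], history[-1]
    return tuple(a + (b - a) * (t_future - t0) / (t1 - t0)
                 for a, b in zip(p0, p1))

def control_step(frame, t, camera_xyz, history, t_future):
    corners = extract_feature_points(frame)          # item 2
    pose_cam = estimate_pose(corners)                # item 3
    pose_base = to_base_frame(pose_cam, camera_xyz)  # item 4
    history.append((t, pose_base))                   # item 5
    return fit_and_extrapolate(history, t_future)    # input to item 6
```

Running two steps with a moving camera yields a pose history and an extrapolated target position that the planner would consume.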
4 Low-level Vision

The pose estimation described in the next section requires that feature points be extracted from each of the images; this is the task of the low-level vision module. Feature points in an image could be images of small holes, specific letters, corners, etc. In our experiments the target is a polyhedron, partly so that our image analysis can involve basic local operations [17, 18] implemented on the Connection Machine [19]. The feature points that we use are the vertices of the polyhedron, and the geometric database describing the target contains the 3D locations of these feature points together with the edges which connect them.

Our image processing proceeds by a sequence of simple algorithms -- enhancement, edge detection, edge thinning and corner detection. Given k vertices detected in an image, the vertices are then paired using a k x k array of cells in the Connection Machine, each cell assigned to a pair of vertices, as follows:

1. Enable the cells in the upper triangle of the grid, and disable the cells in the lower triangle of the array (vertex pairs are not ordered).

2. Copy the addresses of the vertices and the incident angles of their edges to the cells in the diagonal of the grid.

3. By horizontal and vertical grid-scan, spread the addresses of the vertices and the incident angles of their edges from the diagonal cells along rows and columns to all the cells in the array.

4. Each non-diagonal active cell (i, j) now has all the information about the i-th and j-th vertices, and can determine whether they have a pair of collinear incident edges. If they do, then that vertex pair is marked as being connected by an edge. (Note that we could also count the number of edge pixels along the line joining the vertices, but this would be much more costly on the Connection Machine and not worthwhile for our purposes.)

From this algorithm we obtain a list of all the vertices, with for each vertex a sublist of the vertices connected to it. We then produce a list of all the image triples consisting of one vertex with two vertices connected to it. This list of image triples is input to the intermediate-level vision module described in the next section. The other input is a similar list of triples of world vertices of the target, from the geometric database describing the target.
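A serial sketch may clarify the data flow of steps 1-4 and the triple construction. The corner positions, incident-edge angles, and tolerance below are invented for illustration, and the parallel grid-scan is replaced by an explicit double loop over vertex pairs:

```python
import math

def ang_close(a, b, tol=0.15):
    """True if two angles (radians) agree up to the tolerance."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d) < tol

# hypothetical corner data: image position and incident edge angles
corners = {
    0: ((0.0, 0.0), [0.0, math.pi / 2]),      # edges toward +x and +y
    1: ((1.0, 0.0), [math.pi, math.pi / 2]),  # edges toward -x and +y
    2: ((0.0, 1.0), [-math.pi / 2, 0.0]),     # edges toward -y and +x
}

# step 4 analogue: pair (i, j) is connected when i has an incident edge
# pointing toward j and j has one pointing back toward i
connected = {i: [] for i in corners}
ids = sorted(corners)
for n, i in enumerate(ids):
    (xi, yi), ai = corners[i]
    for j in ids[n + 1:]:
        (xj, yj), aj = corners[j]
        theta = math.atan2(yj - yi, xj - xi)  # direction from i to j
        if any(ang_close(a, theta) for a in ai) and \
           any(ang_close(b, theta + math.pi) for b in aj):
            connected[i].append(j)
            connected[j].append(i)

# image triples: one reference vertex with two vertices connected to it
triples = [(i, a, b)
           for i in ids
           for x, a in enumerate(connected[i])
           for b in connected[i][x + 1:]]
```

With these three corners, vertex 0 is connected to both 1 and 2, so the only triple produced has vertex 0 as its reference vertex.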
5 Intermediate-level Vision: Pose Estimation

The pose estimation algorithm combines three ideas:

1. Pose estimation by matching triples of image features to triples of target features [20].
2. Standard rotations [21].
3. Paraperspective approximation to the true perspective projection [22, 23].

This combination allows the extensive use of 2D look-up tables to replace the costly numerical computations of previous methods [20, 24-26]. Details are given in [27]. The algorithm is implemented on the Connection Machine.

The feature points detected in an image are grouped into triples (the image triangles), and triples of feature points of the target (the target triangles) are considered. Each triangle can be described by one of its vertices (the reference vertex), the length of the two adjacent sides, and the angle between them (the reference angle). The main steps are as follows:

1. Each image triangle is described by the image of its reference vertex, its reference angle, and its edge ratio (ratio of the lengths of the two edges adjacent to the reference vertex).

2. Image triangles are then rotated in the image plane around the reference vertex by a standard rotation which brings one edge into coincidence with the image x-axis.

3. Once an image triangle is in this position, the orientation in space of a target triangle producing it can be described by three parameters, if we approximate the true perspective projection with a paraperspective projection [22, 23]. This approximation gives better results when the reference vertex is close to the image center.

4. For each target triangle there is one 2D lookup table in which the reference angle and edge ratio of an image triangle are entered; for each image triangle/target triangle pair, the table gives the two possible orientations of the target triangle in space.

5. Comparing the size of the image triangle to the size of the target triangle, we find the distance of the target triangle from the camera lens center.

6. The standard rotation corresponds to a rotation of the camera around the center of projection; it can be reversed to recover the actual orientation of the target triangle in the camera coordinate system.

7. Knowing the position and orientation of a target triangle and the position of the target center with respect to it, we obtain an estimate of the 3D position of the target center, and from there the pose of the whole target.

The system does not have any a priori knowledge of which image features correspond to which target features, so it uses all the possible combinations of target triangles and image triangles. Improper matches produce pose estimates which are not clustered with the others; since most consistent matches are clustered around the actual pose, clustering the pose estimates allows us to identify the actual 3D pose of the target, even if the consistent matches are few compared with the total number of possible matches. When RAMBO analyzes its first image it has to consider all combinations; however, once a preliminary analysis of the pose of the target has been obtained, it is useful to consider only the target triangles which are likely to be visible, and to match image triangles only to these. The system can also avoid considering matches for target triangles which are at a nearly grazing angle in the camera view, since the paraperspective approximation does not approximate true perspective well for such triangles. Both restrictions increase the proportion of good matches over the total number of possible matches. For a target producing less than 16K image triangle/target triangle combinations, each pose calculation takes around one second on a CM-2 with 16K processors but without floating point processors.
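As a concrete illustration of steps 1 and 2, the sketch below computes the reference angle, the edge ratio, and the standard-rotation angle for an image triangle. The quantization of (reference angle, edge ratio) into indices of a 2D lookup table is our own invented discretization, not the paper's tables:

```python
import math

def triangle_key(p_ref, p_a, p_b, nbins=64):
    """Reference angle and adjacent-edge ratio of an image triangle,
    quantized as (row, column) indices into a hypothetical 2D table."""
    ax, ay = p_a[0] - p_ref[0], p_a[1] - p_ref[1]
    bx, by = p_b[0] - p_ref[0], p_b[1] - p_ref[1]
    la, lb = math.hypot(ax, ay), math.hypot(bx, by)
    # reference angle between the two edges adjacent to the reference vertex
    cosang = (ax * bx + ay * by) / (la * lb)
    ref_angle = math.acos(max(-1.0, min(1.0, cosang)))
    ratio = min(la, lb) / max(la, lb)  # edge ratio in (0, 1]
    i = min(nbins - 1, int(ref_angle / math.pi * nbins))
    j = min(nbins - 1, int(ratio * nbins))  # clamp ratio == 1 into last bin
    return i, j

def standard_rotation(p_ref, p_a):
    """Angle of the in-plane rotation about the reference vertex that
    brings the edge (p_ref -> p_a) onto the image x-axis."""
    return -math.atan2(p_a[1] - p_ref[1], p_a[0] - p_ref[0])
```

For a right isosceles image triangle, the key lands in the bin for a 90-degree reference angle and a unit edge ratio, and a vertical edge requires a -90-degree standard rotation.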
6 Motion Prediction

The computation of a target position from an image gives the translation vector and the rotation matrix of the target in the camera coordinate system at the time the image was taken. The camera itself is set in motion by the robot arm, so the trajectory of the camera coordinate system is known, and it is straightforward to change coordinates and to find the target position in an absolute coordinate system. However, the robot must be able to predict future positions of the target in order to construct plans of actions.

From a sequence of images, we obtain a sequence of target positions in the absolute coordinate system, each with a time label. These target positions are points in six-dimensional location/orientation space (three translation parameters and three rotation angles). We can fit a parametric function of time, such as a polynomial, to each of these six sequences of coordinates. The target trajectory is then described parametrically by six functions of time, and calculating these functions for a future value t of the time parameter will give a predicted target position at this future time. If RAMBO is used in space, then in the absence of external forces only specific motions are possible (for example, an element could be tethered at one end, and rotations occur around the principal axes of inertia), and this knowledge could be used to determine what ranges and types of parametric functions are appropriate.
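The per-coordinate fit-and-extrapolate step can be sketched with ordinary polynomial fitting. The pose history below is fabricated (a target drifting in x and spinning slowly about two axes); the polynomial degree is an assumption, not a value from the paper:

```python
import numpy as np

# hypothetical pose history: time labels and 6-vector poses
# (three translations, three rotation angles) in an absolute frame
times = np.array([0.0, 0.1, 0.2, 0.3, 0.4])
poses = np.stack([
    np.array([0.5 + 0.2 * t,  # x drifts linearly
              1.0,            # y constant
              0.0,            # z constant
              0.1 * t,        # slow roll
              0.0,
              2.0 * t])       # faster yaw
    for t in times])

DEGREE = 2  # assumed degree of each of the six coordinate polynomials

# fit one polynomial of time per coordinate of location/orientation space
coeffs = [np.polyfit(times, poses[:, k], DEGREE) for k in range(6)]

def predict(t):
    """Predicted 6-DOF target pose at a future time t."""
    return np.array([np.polyval(c, t) for c in coeffs])

future = predict(0.6)
```

Evaluating the six fitted functions at a future time t gives the predicted pose that the planner uses to compute goal point trajectories.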
7 Task and Trajectory Planning

Once the target trajectory is known, RAMBO makes plans of actions. Each complex goal can be decomposed into a sequence of simple subgoals, with the simplifying assumption that each subgoal can be performed with one joint of the robot in a fixed position with respect to the target. This joint has to "tag along" with the target so that it does not move with respect to it; thus we call this joint the tagging joint. The fixed position required of the tagging joint by a subgoal, specified with respect to the target, will be called a goal point. Each goal point is defined by six coordinates: three for the location and three for the orientation of the tagging joint in the frame of reference of the target; this frame of reference can be chosen so that its axes coincide with the principal axes of inertia of the target. All the goal points required for the subgoals can be predefined in a database of actions for each target.

For example, consider programming a robot arm on a space shuttle to perform a task on a tumbling satellite. A subgoal might be grabbing a handle of the satellite. Here the tagging joint is the wrist, and the goal point is a position above the handle, fixed in the satellite frame of reference, specifying in what position the wrist of the robot arm should be in order for the end effector to grab the handle. Once the wrist is positioned at the goal point, it is moving along with the satellite, and the same grabbing motion can be used to complete the subgoal as would be needed if the satellite were not moving: since the tagging joint keeps a constant position with respect to the target, it is equivalent to programming a task on a fixed object. The finer details of the motion control of the joints distal to the tagging joint will not be considered here. More complex cases, in which a subgoal would also specify the motions of these distal joints, could occur, but they will not be considered here; the database describing the geometry of the target could be used to determine the best way to parameterize such motions.

In our experimental setup, we have concentrated on reaching the goal points. The tool is a laser pointer mounted on the tool plate of the robot arm, and each subgoal consists of illuminating a light-sensitive diode mounted on the surface of the target for a given duration. Each diode is mounted inside a tube, at the focal point of a lens which closes that tube, so that the laser beam must be roughly aligned with the optical axis of the lens to trigger the electronic circuits which control the diodes. Thus a goal point for the laser tool is defined by a position at a short distance from the lens of a diode along the optical axis, and by the orientation of this axis.
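Using such a goal point for planning implies a change of frame: a goal point stored in the target frame must be mapped through the predicted target pose into the robot base frame. A minimal sketch, with an invented pose and goal point:

```python
import numpy as np

def rot_z(a):
    """Rotation matrix about the z-axis by angle a (radians)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# hypothetical goal point fixed in the target frame: a position a short
# distance in front of a diode lens, plus the optical-axis direction
p_goal_target = np.array([0.3, 0.0, 0.1])
axis_target = np.array([0.0, 0.0, 1.0])

# invented predicted target pose in the robot base frame at some time t
R_target = rot_z(np.pi / 2)
t_target = np.array([1.0, 2.0, 0.0])

# map the goal point and its axis into the base frame
p_goal_base = R_target @ p_goal_target + t_target
axis_base = R_target @ axis_target
```

Repeating this mapping along the predicted target trajectory yields the goal point trajectory that the tagging joint must reach and follow.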
8 Bringing a Tagging Joint from a Launching Point to a Goal Point

Suppose the tagging joint is moving along an original trajectory fo(t) in location/orientation space, and the goal trajectory fg(t) is given (Figure 2). Once a launching time to and a reaching duration T are chosen, the operation of reaching the goal trajectory consists of "launching" the tagging joint from its original trajectory fo(t) at time to and "landing" smoothly on the goal trajectory fg(t) at time tg = to + T. At time to we want the reaching trajectory fr(t) to depart from the original trajectory fo(t), and at time tg we want it to reach the goal trajectory fg(t). Furthermore, so that the velocities change smoothly, the first derivatives of the reaching trajectory at these end points should be equal to the first derivative of fo(t) at time to and to the first derivative of fg(t) at time tg.

Thus the end points of the reaching trajectory fr(t), as well as the first derivatives of the reaching trajectory at these points, are known. These boundary conditions are enough to define fr(t) in terms of a parametric cubic spline, a curve in which all the coefficients of the six cubic polynomials of time can be calculated.

In our experiments we have also explored an alternative method which uses a scalar piecewise interpolation function F(t) which is 0 at time to, 1 at time tg, with horizontal tangents at the end points and continuous derivatives in the time interval, and the reaching trajectory is the interpolated trajectory

    fr(t) = (1 - F(t)) fo(t) + F(t) fg(t).

This function is expressed by

    F(t) = 2 ((t - to)/T)^2,       for to <= t <= to + T/2,
    F(t) = 1 - 2 ((tg - t)/T)^2,   for to + T/2 <= t <= tg.
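The piecewise blend above is easy to check numerically. The sketch below uses invented two-dimensional trajectories in place of the paper's six-dimensional location/orientation trajectories, and verifies the launch, midpoint, and landing behavior:

```python
def F(t, t0, T):
    """Piecewise-quadratic blend: 0 at t0, 1 at t0 + T,
    horizontal tangents at both ends, continuous at the midpoint."""
    u = (t - t0) / T
    return 2 * u * u if u <= 0.5 else 1 - 2 * (1 - u) ** 2

def reach(t, t0, T, f_o, f_g):
    """Reaching trajectory interpolated between trajectories f_o and f_g."""
    a = F(t, t0, T)
    return tuple((1 - a) * o + a * g for o, g in zip(f_o(t), f_g(t)))

# invented example trajectories (2D instead of the paper's 6D)
f_o = lambda t: (t, 0.0)  # original trajectory
f_g = lambda t: (t, 1.0)  # goal trajectory

p0 = reach(0.0, 0.0, 2.0, f_o, f_g)  # launch: still on f_o
p1 = reach(1.0, 0.0, 2.0, f_o, f_g)  # midpoint: halfway between
p2 = reach(2.0, 0.0, 2.0, f_o, f_g)  # landing: on f_g
```

Because F has zero slope at both ends, the reaching trajectory matches the velocities of fo at launch and of fg at landing, as required by the boundary conditions above.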