Storm Troubleshooting

Troubleshooting:
This page lists problems people have run into when using Storm, along with their solutions.
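Worker processes crash on startup with no stack trace
Possible symptoms:
Topologies work on a single node, but workers crash when the topology runs on multiple nodes.
Solutions:
You may have a misconfigured subnet in which nodes cannot locate each other by hostname. ZeroMQ sometimes crashes the process when it cannot resolve a host. There are two solutions:
1. Map hostnames to IP addresses in /etc/hosts.
2. Set up an internal DNS server so that nodes can locate each other by hostname.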
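Nodes are unable to communicate with each other
Possible symptoms:
Every spout tuple is failing
Processing is not working
Solutions:
Storm does not work with IPv6. You can force IPv4 by adding -Djava.net.preferIPv4Stack=true to the supervisor child options and then restarting the supervisor.
You may have a misconfigured subnet. See the previous entry on worker processes crashing on startup.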
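Topology stops processing tuples after a while
Symptoms:
Processing works fine for a while, then suddenly stops, and spout tuples start failing en masse.
Solutions:
This is a known issue with ZeroMQ 2.1.10. Downgrade to ZeroMQ 2.1.7.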
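Not all supervisors appear in the Storm UI
Symptoms:
Some supervisor processes are missing from the Storm UI
The list of supervisors in the Storm UI changes on refresh
Solutions:
Make sure the supervisor local dirs are independent (e.g., not a local dir shared over NFS).
Try deleting the supervisors' local dirs and restarting the daemons. Each supervisor creates a unique id for itself and stores it locally. When that id is copied to other nodes, Storm gets confused.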

“Multiple defaults.yaml found” error

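Symptoms:
When deploying a topology with “storm jar”, you get this error.
Solution:
You are most likely including the Storm jars inside your topology jar. When packaging your topology jar, do not include the Storm jars; Storm will put them on the classpath for you.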

“NoSuchMethodError” when running storm jar

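Symptoms:
When running storm jar, you get a cryptic “NoSuchMethodError”.
Solution:
You are deploying your topology with a different version of Storm than the one you built it against. Make sure the storm client you use comes from the same version as the one you compiled your topology against.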
English original:
https://github.com/nathanmarz/storm/wiki/Troubleshooting

Troubleshooting


This page lists issues people have run into when using Storm along with their solutions.

Worker processes are crashing on startup with no stack trace

Possible symptoms:

  • Topologies work with one node, but workers crash with multiple nodes

Solutions:

  • You may have a misconfigured subnet, where nodes can’t locate other nodes based on their hostname. ZeroMQ sometimes crashes the process when it can’t resolve a host. There are two solutions:
    • Make a mapping from hostname to IP address in /etc/hosts
    • Set up an internal DNS so that nodes can locate each other based on hostname.
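Before changing /etc/hosts or DNS, it can help to confirm which hostnames actually fail to resolve on a given node. The following is a minimal sketch, not part of Storm: the function name unresolvable_hosts and the example hostnames are made up for illustration, and it only tests name resolution as seen from the machine it runs on.

```python
import socket
from typing import Callable, Iterable, List


def unresolvable_hosts(
    hostnames: Iterable[str],
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> List[str]:
    """Return the hostnames that cannot be resolved to an IP address."""
    failed = []
    for name in hostnames:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
    return failed


if __name__ == "__main__":
    # Hypothetical node names; replace with the hostnames of your cluster.
    print(unresolvable_hosts(["storm-node-1", "storm-node-2"]))
```

Running this on every node with the full list of cluster hostnames should print an empty list once the /etc/hosts mapping or internal DNS is in place.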

Nodes are unable to communicate with each other

Possible symptoms:

  • Every spout tuple is failing
  • Processing is not working

Solutions:

  • Storm doesn’t work with ipv6. You can force ipv4 by adding -Djava.net.preferIPv4Stack=true to the supervisor child options and restarting the supervisor.
  • You may have a misconfigured subnet. See the solutions for “Worker processes are crashing on startup with no stack trace”.
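As a rough illustration of the IPv4 fix, the sketch below adds the flag to a string of JVM options. It is only string handling: the helper name with_ipv4_flag is hypothetical, it assumes the options are separated by whitespace with no embedded spaces, and which storm.yaml key holds the "supervisor child options" (for example supervisor.childopts or worker.childopts) is an assumption to verify against your Storm version.

```python
IPV4_FLAG = "-Djava.net.preferIPv4Stack=true"
IPV4_PROPERTY_PREFIX = "-Djava.net.preferIPv4Stack="


def with_ipv4_flag(childopts: str) -> str:
    """Return childopts with exactly one IPv4 flag, set to true."""
    # Assumes whitespace-separated JVM options without embedded spaces.
    options = childopts.split()
    # Drop any existing setting of the same property to avoid conflicts.
    options = [o for o in options if not o.startswith(IPV4_PROPERTY_PREFIX)]
    options.append(IPV4_FLAG)
    return " ".join(options)


if __name__ == "__main__":
    print(with_ipv4_flag("-Xmx768m"))
```

The resulting string would then be written back to the configuration by hand, followed by a supervisor restart as described above.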

Topology stops processing tuples after a while

Symptoms:

  • Processing works fine for a while, and then suddenly stops and spout tuples start failing en masse.

Solutions:

  • This is a known issue with ZeroMQ 2.1.10. Downgrade to ZeroMQ 2.1.7.

Not all supervisors appear in Storm UI

Symptoms:

  • Some supervisor processes are missing from the Storm UI
  • List of supervisors in Storm UI changes on refreshes

Solutions:

  • Make sure the supervisor local dirs are independent (e.g., not sharing a local dir over NFS)
  • Try deleting the local dirs for the supervisors and restarting the daemons. Supervisors create a unique id for themselves and store it locally. When that id is copied to other nodes, Storm gets confused.
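One way to check the first point on Linux is to look up which filesystem a supervisor's local dir lives on. The sketch below is an assumption-laden helper, not a Storm tool: it parses text in the format of /proc/mounts, the function names are hypothetical, and the path /var/storm is only an example of what the local dir (the storm.local.dir setting) might be.

```python
from typing import Optional

NETWORK_FS_TYPES = {"nfs", "nfs4"}


def fs_type_for(path: str, mounts_text: str) -> Optional[str]:
    """Return the filesystem type of the longest mount point containing path."""
    target = path.rstrip("/") + "/"
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # not a "device mountpoint fstype ..." line
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if target.startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def is_on_nfs(path: str, mounts_text: str) -> bool:
    """True if path sits on a mount whose type is listed in NETWORK_FS_TYPES."""
    return fs_type_for(path, mounts_text) in NETWORK_FS_TYPES


if __name__ == "__main__":
    with open("/proc/mounts") as mounts:  # Linux only
        print(is_on_nfs("/var/storm", mounts.read()))  # example path
```

If this reports True for a supervisor's local dir, that directory is a candidate for the shared-id problem described above.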

“Multiple defaults.yaml found” error

Symptoms:

  • When deploying a topology with “storm jar”, you get this error

Solution:

  • You’re most likely including the Storm jars inside your topology jar. When packaging your topology jar, don’t include the Storm jars as Storm will put those on the classpath for you.
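To confirm this diagnosis, you can inspect the topology jar for a bundled defaults.yaml, since a jar is a zip archive. This is a minimal sketch under that assumption; the function name find_entries and the jar path in the usage line are hypothetical.

```python
import zipfile
from typing import List


def find_entries(jar_path: str, entry_name: str = "defaults.yaml") -> List[str]:
    """List archive entries in the jar whose file name equals entry_name."""
    with zipfile.ZipFile(jar_path) as jar:
        return [n for n in jar.namelist() if n.rsplit("/", 1)[-1] == entry_name]


if __name__ == "__main__":
    # Hypothetical path to the jar you pass to "storm jar".
    print(find_entries("target/my-topology.jar"))
```

An empty list suggests the Storm jars were not bundled. With Maven, one common way to keep them out of the topology jar is to give the Storm dependency the provided scope.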

“NoSuchMethodError” when running storm jar

Symptoms:

  • When running storm jar, you get a cryptic “NoSuchMethodError”

Solution:

  • You’re deploying your topology with a different version of Storm than you built your topology against. Make sure the storm client you use comes from the same version as the version you compiled your topology against.
Original post: https://www.cnblogs.com/qgxiaoguang/p/2914462.html