本文的应用环境是指在使用他人已编译好,且不方便自己重新再进行编译的Padavan路由系统下,如何添加软件。
本文涉及的frp故障排查版本为0.51.2,如果后续更新可能会有不同,仅供参考。
Padavan添加软件
Padavan的系统(或者说目前刷路由器,非软路由,可用的系统里)大部分目录都是重启即丢失的。
但需要注意的是,由于Padavan系统的特殊性,关键目录以外的目录均为 tmpfs ,可以理解为是把数据暂存在内存上。因此在/etc/storage目录里所做的修改,如果没有执行保存脚本,就并没有真正的写入 Rom 里,重启之后文件还会丢失。 —— panFake的文章
那么我们的第一个目标就是把frp添加到/etc/storage目录下,并且烧录,使其重启可用。
路由器还是能插U盘的更好一些
我的是红米AC2100,刷入Padavan后,SSH到路由器可见,可用空间大概也就几M,而最新版本的frp大概需要10M,空间严重不足。如果你的路由器能够插U盘,那么方案就变成了:
- 把frp放在U盘指定目录
- 路由器启动时自动挂载U盘
- 拷贝可执行文件到/tmp目录并启动(或者直接在U盘上启动)
不过上述方案不适合我,接下来介绍我的处理。
首要条件需要开启Padavan的SSH功能,这里不赘述,自行搜索。
UPX,好用
本文没有使用UPX的最新版,个人使用的3.95
经过自己的测试,UPX运行的平台与需要修改的可执行文件平台无关,因此如果压缩有问题可能是你的原本的可执行文件的平台不对。
红米AC2100使用的是联发科的MT7621A,个人测试使用的frp版本应为misple版本。
利用UPX工具,使用--best
参数,将frpc从10M压缩到了4M左右,如果你的空间还是不足,可以尝试压缩更旧版本的frpc,我自己压缩的0.39版本可以压缩到2M左右。
压缩好后将frpc和配置文件放在/etc/storage目录下,我自己放在了/etc/storage/frp/下,下文基于此。
放置好后建议先测试一下能否正常连接,因为每次重启后会还原到烧录的版本,如果配置的有问题,还需要重新烧录。
如果连接不上,是打不开frpc的管理面板的,我觉得这一点上frpc可以优化一下,至少可以判断是否启动成功。
如果路由器开启了防火墙,记得放行管理面板的端口和与服务端连接用的端口。不建议直接iptables,在管理面板里找Upnp配置即可。
如果测试没问题可以正常连接你的frps服务端,那么到系统设置里可以准备烧录了,这里借用一张恩山老哥的图片。
选择“保存/etc/storage/内容到闪存”的提交,即可。
注意检查保存时的log,如果出现类似于:
Aug 18 02:37:15 Storage save: Save storage files to MTD partition "/dev/mtdblock6"
Aug 18 02:37:26 Storage save: Invalid storage final data size: 4240811
一般是指/etc/storage容量不足
提交完毕,在系统设置 - 脚本,里找到“路由器启动时”(或者类似的字样),增加一行nohup启动脚本:
# 可执行文件更改为你自己的目录
nohup /etc/storage/frp/frpc -c /etc/storage/frp/frpc.ini >/dev/null 2>&1 &
选择输出到/dev/null是因为我在frpc.ini里配置了日志输出目录,建议输出到tmp避免闪存写满。
可以重启路由器测试frpc是否自动启动好并且连接上了。
frp故障排除
然而我在测试时遭遇了frp用户最常见的错误:
2023/08/14 13:53:31 [I] [service.go:296] EOF
EOF这种错误一般出现在读取返回体没有正确读取的情况,可能跟conn等都有关系,只有一个EOF这难排查。
找到frp的源码,在client/service.go下增加log
附上我自己debug用的service.go代码:
// Copyright 2017 fatedier, [email protected]
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package client
import (
"context"
"crypto/tls"
"fmt"
"io"
"math/rand"
"net"
"runtime"
"strconv"
"strings"
"sync"
"sync/atomic"
"time"
"github.com/fatedier/golib/crypto"
libdial "github.com/fatedier/golib/net/dial"
fmux "github.com/hashicorp/yamux"
quic "github.com/quic-go/quic-go"
"github.com/fatedier/frp/assets"
"github.com/fatedier/frp/pkg/auth"
"github.com/fatedier/frp/pkg/config"
"github.com/fatedier/frp/pkg/msg"
"github.com/fatedier/frp/pkg/transport"
"github.com/fatedier/frp/pkg/util/log"
utilnet "github.com/fatedier/frp/pkg/util/net"
"github.com/fatedier/frp/pkg/util/util"
"github.com/fatedier/frp/pkg/util/version"
"github.com/fatedier/frp/pkg/util/xlog"
)
func init() {
crypto.DefaultSalt = "frp"
// TODO: remove this when we drop support for go1.19
rand.Seed(time.Now().UnixNano())
}
// Service is a client service.
type Service struct {
// uniq id got from frps, attach it in loginMsg
runID string
// manager control connection with server
ctl *Control
ctlMu sync.RWMutex
// Sets authentication based on selected method
authSetter auth.Setter
cfg config.ClientCommonConf
pxyCfgs map[string]config.ProxyConf
visitorCfgs map[string]config.VisitorConf
cfgMu sync.RWMutex
// The configuration file used to initialize this client, or an empty
// string if no configuration file was used.
cfgFile string
exit uint32 // 0 means not exit
// service context
ctx context.Context
// call cancel to stop service
cancel context.CancelFunc
}
func NewService(
cfg config.ClientCommonConf,
pxyCfgs map[string]config.ProxyConf,
visitorCfgs map[string]config.VisitorConf,
cfgFile string,
) (svr *Service, err error) {
svr = &Service{
authSetter: auth.NewAuthSetter(cfg.ClientConfig),
cfg: cfg,
cfgFile: cfgFile,
pxyCfgs: pxyCfgs,
visitorCfgs: visitorCfgs,
ctx: context.Background(),
exit: 0,
}
return
}
func (svr *Service) GetController() *Control {
svr.ctlMu.RLock()
defer svr.ctlMu.RUnlock()
return svr.ctl
}
func (svr *Service) Run(ctx context.Context) error {
ctx, cancel := context.WithCancel(ctx)
svr.ctx = xlog.NewContext(ctx, xlog.New())
svr.cancel = cancel
xl := xlog.FromContextSafe(svr.ctx)
// set custom DNSServer
if svr.cfg.DNSServer != "" {
dnsAddr := svr.cfg.DNSServer
if _, _, err := net.SplitHostPort(dnsAddr); err != nil {
dnsAddr = net.JoinHostPort(dnsAddr, "53")
}
// Change default dns server for frpc
net.DefaultResolver = &net.Resolver{
PreferGo: true,
Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
return net.Dial("udp", dnsAddr)
},
}
}
// login to frps
for {
conn, cm, err := svr.login()
if err != nil {
xl.Warn("login to server failed: %v", err)
// if login_fail_exit is true, just exit this program
// otherwise sleep a while and try again to connect to server
if svr.cfg.LoginFailExit {
return err
}
util.RandomSleep(5*time.Second, 0.9, 1.1)
} else {
// login success
ctl := NewControl(svr.ctx, svr.runID, conn, cm, svr.cfg, svr.pxyCfgs, svr.visitorCfgs, svr.authSetter)
ctl.Run()
svr.ctlMu.Lock()
svr.ctl = ctl
svr.ctlMu.Unlock()
break
}
}
go svr.keepControllerWorking()
if svr.cfg.AdminPort != 0 {
// Init admin server assets
assets.Load(svr.cfg.AssetsDir)
address := net.JoinHostPort(svr.cfg.AdminAddr, strconv.Itoa(svr.cfg.AdminPort))
err := svr.RunAdminServer(address)
if err != nil {
log.Warn("run admin server error: %v", err)
}
log.Info("admin server listen on %s:%d", svr.cfg.AdminAddr, svr.cfg.AdminPort)
}
<-svr.ctx.Done()
// service context may not be canceled by svr.Close(), we should call it here to release resources
if atomic.LoadUint32(&svr.exit) == 0 {
svr.Close()
}
return nil
}
func (svr *Service) keepControllerWorking() {
xl := xlog.FromContextSafe(svr.ctx)
maxDelayTime := 20 * time.Second
delayTime := time.Second
// if frpc reconnect frps, we need to limit retry times in 1min
// current retry logic is sleep 0s, 0s, 0s, 1s, 2s, 4s, 8s, ...
// when exceed 1min, we will reset delay and counts
cutoffTime := time.Now().Add(time.Minute)
reconnectDelay := time.Second
reconnectCounts := 1
for {
<-svr.ctl.ClosedDoneCh()
if atomic.LoadUint32(&svr.exit) != 0 {
return
}
// the first three attempts with a low delay
if reconnectCounts > 3 {
util.RandomSleep(reconnectDelay, 0.9, 1.1)
xl.Info("wait %v to reconnect", reconnectDelay)
reconnectDelay *= 2
} else {
util.RandomSleep(time.Second, 0, 0.5)
}
reconnectCounts++
now := time.Now()
if now.After(cutoffTime) {
// reset
cutoffTime = now.Add(time.Minute)
reconnectDelay = time.Second
reconnectCounts = 1
}
for {
if atomic.LoadUint32(&svr.exit) != 0 {
return
}
xl.Info("try to reconnect to server...")
conn, cm, err := svr.login()
if err != nil {
xl.Warn("reconnect to server error: %v, wait %v for another retry", err, delayTime)
util.RandomSleep(delayTime, 0.9, 1.1)
delayTime *= 2
if delayTime > maxDelayTime {
delayTime = maxDelayTime
}
continue
}
// reconnect success, init delayTime
delayTime = time.Second
ctl := NewControl(svr.ctx, svr.runID, conn, cm, svr.cfg, svr.pxyCfgs, svr.visitorCfgs, svr.authSetter)
ctl.Run()
svr.ctlMu.Lock()
if svr.ctl != nil {
svr.ctl.Close()
}
svr.ctl = ctl
svr.ctlMu.Unlock()
break
}
}
}
// login creates a connection to frps and registers it self as a client
// conn: control connection
// session: if it's not nil, using tcp mux
func (svr *Service) login() (conn net.Conn, cm *ConnectionManager, err error) {
xl := xlog.FromContextSafe(svr.ctx)
cm = NewConnectionManager(svr.ctx, &svr.cfg)
if err = cm.OpenConnection(); err != nil {
xl.Info("Throwed from line:249, the origin error msg is: ")
xl.Info(err.Error())
return nil, nil, err
}
defer func() {
if err != nil {
cm.Close()
}
}()
conn, err = cm.Connect()
if err != nil {
xl.Info("Throwed from line:261, the origin error msg is: ")
xl.Info(err.Error())
return
}
loginMsg := &msg.Login{
Arch: runtime.GOARCH,
Os: runtime.GOOS,
PoolCount: svr.cfg.PoolCount,
User: svr.cfg.User,
Version: version.Full(),
Timestamp: time.Now().Unix(),
RunID: svr.runID,
Metas: svr.cfg.Metas,
}
// Add auth
if err = svr.authSetter.SetLogin(loginMsg); err != nil {
xl.Info("Throwed from line:280, the origin error msg is: ")
xl.Info(err.Error())
return
}
if err = msg.WriteMsg(conn, loginMsg); err != nil {
xl.Info("Throwed from line:286, the origin error msg is: ")
xl.Info(err.Error())
return
}
var loginRespMsg msg.LoginResp
_ = conn.SetReadDeadline(time.Now().Add(60 * time.Second))
if err = msg.ReadMsgInto(conn, &loginRespMsg); err != nil {
xl.Info("Throwed from line:294, the origin error msg is: ")
xl.Info(err.Error())
return
}
_ = conn.SetReadDeadline(time.Time{})
if loginRespMsg.Error != "" {
err = fmt.Errorf("%s", loginRespMsg.Error)
xl.Error("%s", loginRespMsg.Error)
return
}
svr.runID = loginRespMsg.RunID
xl.ResetPrefixes()
xl.AppendPrefix(svr.runID)
xl.Info("login to server success, get run id [%s]", loginRespMsg.RunID)
return
}
func (svr *Service) ReloadConf(pxyCfgs map[string]config.ProxyConf, visitorCfgs map[string]config.VisitorConf) error {
svr.cfgMu.Lock()
svr.pxyCfgs = pxyCfgs
svr.visitorCfgs = visitorCfgs
svr.cfgMu.Unlock()
svr.ctlMu.RLock()
ctl := svr.ctl
svr.ctlMu.RUnlock()
if ctl != nil {
return svr.ctl.ReloadConf(pxyCfgs, visitorCfgs)
}
return nil
}
func (svr *Service) Close() {
svr.GracefulClose(time.Duration(0))
}
func (svr *Service) GracefulClose(d time.Duration) {
atomic.StoreUint32(&svr.exit, 1)
svr.ctlMu.RLock()
if svr.ctl != nil {
svr.ctl.GracefulClose(d)
svr.ctl = nil
}
svr.ctlMu.RUnlock()
if svr.cancel != nil {
svr.cancel()
}
}
type ConnectionManager struct {
ctx context.Context
cfg *config.ClientCommonConf
muxSession *fmux.Session
quicConn quic.Connection
}
func NewConnectionManager(ctx context.Context, cfg *config.ClientCommonConf) *ConnectionManager {
return &ConnectionManager{
ctx: ctx,
cfg: cfg,
}
}
func (cm *ConnectionManager) OpenConnection() error {
xl := xlog.FromContextSafe(cm.ctx)
// special for quic
if strings.EqualFold(cm.cfg.Protocol, "quic") {
var tlsConfig *tls.Config
var err error
sn := cm.cfg.TLSServerName
if sn == "" {
sn = cm.cfg.ServerAddr
}
if cm.cfg.TLSEnable {
tlsConfig, err = transport.NewClientTLSConfig(
cm.cfg.TLSCertFile,
cm.cfg.TLSKeyFile,
cm.cfg.TLSTrustedCaFile,
sn)
} else {
tlsConfig, err = transport.NewClientTLSConfig("", "", "", sn)
}
if err != nil {
xl.Warn("fail to build tls configuration, err: %v", err)
return err
}
tlsConfig.NextProtos = []string{"frp"}
conn, err := quic.DialAddr(
cm.ctx,
net.JoinHostPort(cm.cfg.ServerAddr, strconv.Itoa(cm.cfg.ServerPort)),
tlsConfig, &quic.Config{
MaxIdleTimeout: time.Duration(cm.cfg.QUICMaxIdleTimeout) * time.Second,
MaxIncomingStreams: int64(cm.cfg.QUICMaxIncomingStreams),
KeepAlivePeriod: time.Duration(cm.cfg.QUICKeepalivePeriod) * time.Second,
})
if err != nil {
return err
}
cm.quicConn = conn
return nil
}
if !cm.cfg.TCPMux {
return nil
}
conn, err := cm.realConnect()
if err != nil {
return err
}
fmuxCfg := fmux.DefaultConfig()
fmuxCfg.KeepAliveInterval = time.Duration(cm.cfg.TCPMuxKeepaliveInterval) * time.Second
fmuxCfg.LogOutput = io.Discard
fmuxCfg.MaxStreamWindowSize = 6 * 1024 * 1024
session, err := fmux.Client(conn, fmuxCfg)
if err != nil {
return err
}
cm.muxSession = session
return nil
}
func (cm *ConnectionManager) Connect() (net.Conn, error) {
if cm.quicConn != nil {
stream, err := cm.quicConn.OpenStreamSync(context.Background())
if err != nil {
return nil, err
}
return utilnet.QuicStreamToNetConn(stream, cm.quicConn), nil
} else if cm.muxSession != nil {
stream, err := cm.muxSession.OpenStream()
if err != nil {
return nil, err
}
return stream, nil
}
return cm.realConnect()
}
func (cm *ConnectionManager) realConnect() (net.Conn, error) {
xl := xlog.FromContextSafe(cm.ctx)
var tlsConfig *tls.Config
var err error
tlsEnable := cm.cfg.TLSEnable
if cm.cfg.Protocol == "wss" {
tlsEnable = true
}
if tlsEnable {
sn := cm.cfg.TLSServerName
if sn == "" {
sn = cm.cfg.ServerAddr
}
tlsConfig, err = transport.NewClientTLSConfig(
cm.cfg.TLSCertFile,
cm.cfg.TLSKeyFile,
cm.cfg.TLSTrustedCaFile,
sn)
if err != nil {
xl.Warn("fail to build tls configuration, err: %v", err)
return nil, err
}
}
proxyType, addr, auth, err := libdial.ParseProxyURL(cm.cfg.HTTPProxy)
if err != nil {
xl.Error("fail to parse proxy url")
return nil, err
}
dialOptions := []libdial.DialOption{}
protocol := cm.cfg.Protocol
switch protocol {
case "websocket":
protocol = "tcp"
dialOptions = append(dialOptions, libdial.WithAfterHook(libdial.AfterHook{Hook: utilnet.DialHookWebsocket(protocol, "")}))
dialOptions = append(dialOptions, libdial.WithAfterHook(libdial.AfterHook{
Hook: utilnet.DialHookCustomTLSHeadByte(tlsConfig != nil, cm.cfg.DisableCustomTLSFirstByte),
}))
dialOptions = append(dialOptions, libdial.WithTLSConfig(tlsConfig))
case "wss":
protocol = "tcp"
dialOptions = append(dialOptions, libdial.WithTLSConfigAndPriority(100, tlsConfig))
// Make sure that if it is wss, the websocket hook is executed after the tls hook.
dialOptions = append(dialOptions, libdial.WithAfterHook(libdial.AfterHook{Hook: utilnet.DialHookWebsocket(protocol, tlsConfig.ServerName), Priority: 110}))
default:
dialOptions = append(dialOptions, libdial.WithTLSConfig(tlsConfig))
}
if cm.cfg.ConnectServerLocalIP != "" {
dialOptions = append(dialOptions, libdial.WithLocalAddr(cm.cfg.ConnectServerLocalIP))
}
dialOptions = append(dialOptions,
libdial.WithProtocol(protocol),
libdial.WithTimeout(time.Duration(cm.cfg.DialServerTimeout)*time.Second),
libdial.WithKeepAlive(time.Duration(cm.cfg.DialServerKeepAlive)*time.Second),
libdial.WithProxy(proxyType, addr),
libdial.WithProxyAuth(auth),
)
conn, err := libdial.DialContext(
cm.ctx,
net.JoinHostPort(cm.cfg.ServerAddr, strconv.Itoa(cm.cfg.ServerPort)),
dialOptions...,
)
return conn, err
}
func (cm *ConnectionManager) Close() error {
if cm.quicConn != nil {
_ = cm.quicConn.CloseWithError(0, "")
}
if cm.muxSession != nil {
_ = cm.muxSession.Close()
}
return nil
}
修改后编译一个debug版本,再放到路由器内测试。
https://github.com/fatedier/frp/blob/dev/package.sh 即这个项目的编译脚本,直接执行即可得到所需的比如misple版本。
可以看出错误位置在于:
2023/08/14 13:53:31 [I] [service.go:295] Throwed from line:294, the origin error msg is:
2023/08/14 13:53:31 [I] [service.go:296] EOF
考虑到在参考其他人方案改为kcp模式时也会出现i/o deadline问题,这里我修改了读取的时间,从原来的10秒改为60秒:
_ = conn.SetReadDeadline(time.Now().Add(60 * time.Second))
if err = msg.ReadMsgInto(conn, &loginRespMsg); err != nil {
xl.Info("Throwed from line:294, the origin error msg is: ")
xl.Info(err.Error())
return
}
修改时间这个方式不一定是最后跑通的关键,建议先往后看。
修改后再编译执行,发现还是EOF,而且跟时间无关,日志仍然是秒报错。
2023/08/14 14:27:41 [I] [root.go:220] start frpc service for config file [./frpc.ini]
2023/08/14 14:27:41 [I] [service.go:295] Throwed from line:294, the origin error msg is:
2023/08/14 14:27:41 [I] [service.go:296] EOF
2023/08/14 14:27:41 [W] [service.go:133] login to server failed: EOF
这个时候我(大概,可能,记不太清了)注意到了有这么一行注释 提到了Mux配置,我立即检查了我的服务端,发现我的服务端配置的tcp_mux
选项(文档链接)我设置成了false,而客户端默认没有使用时则为true,这里不匹配导致了conn的连接问题。
都更改为true之后,EOF问题解决,成功连接。